1.  OVERVIEW


The study reported here was conducted in response to a contract that
set the following objective: "Utilizing available climatological cloud
property data bases from a representative selection of climates,
determine the information content in a series of simulated upper air in situ
measurements of cloud presence along a path to provide for estimates of
mesoscale cloud cover, tops, and bases from the earth's surface to 8 km
above ground level."

The general context of the problem is a battlefield. The specific
scenario is a tactical target area 50 km across. The cloud probe is a
simple binary sensor, capable of reporting only "I am in cloud" or "I am
in clear air." Its reports are telemetered back to a ground station at
the rate of 1 Hz. The sensor is carried aboard an automatically piloted
vehicle (APV) that is limited in range and speed but is capable of
executing a prescribed flight path. Its position is known at all times.

Figure 1.1 illustrates the three general classes of sampling
patterns that were examined. In the first, the target volume is sampled
through a succession of horizontal patterns that are stepped in altitude,
each horizontal sample consisting of measurements taken along a single
pass. The second pattern is identical, except that the horizontal sample
is now taken along a flight path that is more than a single pass. The
third pattern consists of alternate ascents and descents in a tight
spiral, the result being that each level is sampled in a pattern of widely
separated points.

To evaluate the "sampling accuracy" of these patterns, i.e., the
accuracy of inferences based on the samples, we adopted two independent
approaches. The experimental approach was founded on a set of 132 actual
cloud fields as observed from a Geostationary Operational Environmental
Satellite (GOES). Computer programs enabled simulated sampling patterns
to be flown through these fields. Analysis of these led to estimates of
the sampling accuracy of the various patterns. Details of the basic cloud
fields and their collection are described in Appendix A.

In parallel, theoretical estimates of sampling accuracy were derived
from the binomial distribution. This distribution is known to depict
accurately the statistical properties of a collective of binary samples,
which is what a set of our yes/no cloud measurements comprises. The sole
complication is that the binomial distribution is valid only for
collectives of independent samples, whereas the successive and nearby point
measurements in our scenario are assuredly not independent. In order to
retain use of the binomial distribution, we introduce the concept of an
"independence fraction," which reduces the N actual points of a sample
to a statistically equivalent sample consisting of N' independent points.

Appendix B elaborates on the use of the binomial distribution for
our purpose, while Appendix C discusses evaluation of the independence
fraction.

Besides the broad objective cited above, the contract Statement of
Work (SOW) asks that several pointed questions be answered for the case
that the sampling runs are straight-line horizontal paths. Then, the
Statement asks whether the best estimates of cloud parameters are achieved
through this sampling mode or whether some alternate trajectory could
significantly reduce uncertainties in these estimates.

In Section 2 of this report, the accuracy of sampling in horizontal
passes is evaluated, and its dependence on the length of the pass
established. There also, the specific questions of the SOW are addressed
one by one.

Section 3 then examines the accuracy of sampling in horizontal
patterns other than straight passes and finds that, for a given sampling
length, certain patterns are more effective than a straight pass.

Section 4 treats the last of the 3 classes of sampling strategy:
the "vertical" flight pattern that produces a configuration of isolated
point samples at each of the horizontal levels. Here it is found that,
in terms of the accuracy with which the areal cloud fraction can be
inferred, a pattern of relatively few, well-positioned points is the
equivalent of a rather long sampling path. This is consistent with the
estimate of independence fraction of a "continuous" horizontal sample, which
suggests that the information content of a 10-km segment is no more than
that of its two end points.

Finally, in Section 5 the best horizontal and vertical sampling
patterns are compared in terms of time and fuel required for execution.
It is found that, for a given accuracy, the best horizontal strategy
costs almost twice as much as the best vertical strategy. Clearly, the
waste of time and fuel resulting from the redundancy of information in
horizontal sampling more than compensates for the higher rate of fuel
consumption entailed in the "pogo-stick" flight path of the vertical
pattern.

Besides the aforementioned advantage of the vertical strategy, it is
vastly superior to the horizontal with respect to fixing cloud base and
top and, consequently, in recognizing the existence of discrete layers.
This latter capability makes possible a confident answer to an important
question that can only be guessed at from horizontal samples, namely:
what is the overall cloud fraction when more than a single layer is
present?

The contract SOW invited us, first, to work the overall problem
assuming a horizontally homogeneous cloud field and, then, to consider and
evaluate the effects of inhomogeneity. However, neither our experimental
approach nor the theoretical was made simpler by an assumption of
homogeneity. Consequently, the general case was attacked from the outset,
and the conclusions are valid without regard to degree of homogeneity.
Nevertheless, Appendix D touches on the academic issue of sampling a
homogeneous field and, additionally, discusses a realizable situation that
represents, in our view, the most troublesome form of inhomogeneity.

Sampling efficiency is found to depend on the climatological
frequency of cloud amount. What is relevant is cloudiness at the level
being sampled, not total cloudiness. It is the latter, unfortunately,
that is treated in standard climatological summaries. To generate the
specialized statistics required for our purpose, a model developed
recently by I. I. Gringorten of the Air Force Geophysics Laboratory (AFGL)
was employed.

Throughout the study, whether the sampling pattern is horizontal or
vertical, what is sampled is the "projected cloud fraction" or "earth
cover," in distinction to "sky cover," which is the fractional coverage
as seen from a point on the ground. However, the SOW poses its questions
in terms of cloud cover. Hence, a means of converting from cloud
fraction to cloud cover is required and is dealt with in Section 6. There
it is found that the differences detectable between the two measures are
small relative to the scatter in our data, and we conclude that in
operational practice it is better to assume that sky cover is identical to
earth cover.

As is commonly the case, some of the intermediate results achieved
during the course of this study were not directly used in the final
results. Nevertheless, a few of those incidental results are possibly of
general interest and are, therefore, described in the two final
appendices: E and F.


2.  SAMPLING IN HORIZONTAL PASSES


2.1  Accuracy  of  Cloud  Amount  Estimate  vs.  Length  of  Sampling  Path. 

The first of the specific problems posed by the contract Statement
of Work (SOW) was: "Assume a straight line vehicle trajectory at a
given level through a cloud deck and determine the trade-offs between
path length and uncertainty estimates in the calculation of cloud cover."
We approached this question both experimentally and theoretically.

2.1.1  The  Experimental  Answer. 

The experimental basis of the entire study is 132 cloud fields
observed from a NOAA GOES satellite (National Oceanic and Atmospheric
Administration, Geostationary Operational Environmental Satellite). Details
of this data base are described in Appendix A. Out of the 132 cases, 50
were randomly selected and set aside as independent data to be used in
testing any conclusions based on the "development sample" of 82 cases.

Each of the basic cloud fields is a rectangular array of binary
pixels (i.e., picture elements denoting only cloud or no-cloud)
covering an area 100x100 km on the earth's surface. The number of rows and
columns in the array varies with distance from the sub-satellite point,
but the average spacing in our development sample is 1.23 km in the N-S
direction and 0.82 km in the E-W.

To answer this first question of the SOW, passes of various lengths
from 10 to 100 km were simulated in the observed cloud fields. The
procedure will be illustrated for the case of 60-km passes in a cloud field
centered over south central Tennessee on December 29, 1980 and observed
at local noon. On this occasion the 100x100 km cloud array consisted of
84 rows and 125 columns, and the cloud fraction over the entire array,
denoted NA, was 0.444. Along each of the 209 lines of the array (rows
and columns together) three 60-km passes were laid out symmetrically. The
cloud fraction, NL, for each of these 627 simulated passes was evaluated,
and a frequency distribution constructed. The result is shown in Table
2.1. From this distribution an accuracy index, denoted P(.1), was
evaluated. P(.1) is defined as the fraction of the 627 values of NL falling
within 0.1 of 0.4 (NA rounded to nearest tenth). In the present case

TABLE 2.1  FREQUENCY DISTRIBUTION OF PASS CLOUD FRACTION, NL.

NL (tenths)       0     1     2     3     4     5     6     7     8     9    10

Frequency (%)    1.2   2.8   7.3  11.0  18.9  23.1  17.5  11.1   5.1   1.5    0
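
To make the definition of P(.1) concrete, the following sketch applies it to the distribution of Table 2.1. It assumes that the classes of the table are values of NL rounded to the nearest tenth, so that "within 0.1 of 0.4" means classes 3, 4, and 5; the function name is illustrative and does not come from the report. Under that reading the index comes to about 0.53 for this cloud field.

```python
# Frequencies (%) from Table 2.1, indexed by pass cloud fraction NL in tenths.
freq_pct = [1.2, 2.8, 7.3, 11.0, 18.9, 23.1, 17.5, 11.1, 5.1, 1.5, 0.0]


def p_within_one_tenth(freq_pct, na_tenths):
    """Percent of passes whose NL class lies within one tenth of na_tenths.

    Assumption: the table classes are NL rounded to the nearest tenth.
    """
    lo = max(na_tenths - 1, 0)
    hi = min(na_tenths + 1, len(freq_pct) - 1)
    return sum(freq_pct[lo:hi + 1])


# NA = 0.444 rounds to 4 tenths, so classes 3, 4, and 5 are counted.
p_01_pct = p_within_one_tenth(freq_pct, 4)
print(round(p_01_pct, 1))
```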

The development sample of cloud fields yielded 82 such values of
P(.1). These were grouped according to value of NA and then averaged.
The sample standard deviation was also evaluated for each class
containing more than 6 values of P(.1).
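
The simulation procedure described above for a single cloud field can be sketched as follows. This is a minimal illustration, not the report's program: the function names are hypothetical, the three passes per line are assumed to start at evenly spaced positions (one reading of "laid out symmetrically"), and the field is synthetic, uncorrelated noise rather than a GOES image. Because real cloud fields are spatially coherent, the accuracy obtained from this synthetic field overstates what the observed fields would give.

```python
import numpy as np


def pass_fractions(field, pass_km, side_km=100.0, passes_per_line=3):
    """Cloud fraction NL of every simulated pass along rows and columns."""
    fractions = []
    for lines in (field, field.T):              # rows first, then columns
        n = lines.shape[1]                      # pixels along each line
        length = int(round(n * pass_km / side_km))
        # Assumed layout: evenly spaced start pixels spanning the line.
        starts = np.linspace(0, n - length, passes_per_line).round().astype(int)
        for line in lines:
            for s in starts:
                fractions.append(line[s:s + length].mean())
    return np.array(fractions)


def accuracy_index(fractions, na, tol=0.1):
    """P(.1): fraction of passes whose NL is within tol of NA rounded to a tenth."""
    target = round(float(na), 1)
    return float(np.mean(np.abs(fractions - target) <= tol + 1e-9))


# Synthetic binary field with the dimensions of the Tennessee example.
rng = np.random.default_rng(0)
field = (rng.random((84, 125)) < 0.444).astype(int)

nl = pass_fractions(field, pass_km=60)
p_01 = accuracy_index(nl, field.mean())
print(len(nl), p_01)    # 209 lines x 3 passes = 627 values of NL
```

Repeating the two calls for each field of the development sample, and for each pass length, would produce the set of P(.1) values that are then grouped by NA.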

All told, this procedure was used to generate statistics for
simulated sampling passes of 6 lengths: 10, 20, 40, 60, 80, and 100 km.
The results are shown in Table 2.2. The strong dependence of P(.1) on
NA is strikingly evident in Figure 2.1.

TABLE 2.2  MEAN SAMPLING ACCURACY, P(.1), AND STANDARD DEVIATION, σ,
           AS A FUNCTION OF PASS LENGTH AND AREAL CLOUD
           FRACTION, NA.

   NA        10 km        20 km        40 km        60 km        80 km       100 km
(tenths)   P(.1)   σ    P(.1)   σ    P(.1)   σ    P(.1)   σ    P(.1)   σ    P(.1)   σ

    0       .96    -     .97    -     .96    -     .98    -     .99    -     .99    -
    1       .88   .03    .88   .04    .90   .04    .92   .03    .94   .03    .95   .04
    2       .24   .15    .35   .13    .47   .10    .54   .09    .61   .10    .70   .11
    3       .21   .08    .31   .10    .43   .14    .54   .14    .58   .16    .64   .15
    4       .17   .08    .25   .08    .35   .11    .45   .14    .53   .17    .61   .17
    5       .17   .09    .26   .10    .35   .14    .42   .14    .51   .14    .56   .13
    6       .17   .10    .24   .13    .33   .14    .42   .14    .51   .14    .60   .14
    7       .20   .05    .28   .09    .41   .12    .51   .15    .59   .15    .65   .19
    8       .20   .06    .33   .09    .50   .08    .60   .11    .67   .12    .74   .12
    9       .85    -     .83    -     .83    -     .84    -     .84    -     .82    -
   10       .99    -     .99    -    1.00    -    1.00    -    1.00    -    1.00    -

Unweighted
  Mean      .46          .52          .59          .66          .71          .75


Figure 2.1.  Sampling Accuracy, P(.1), as a Function of
             Areal Cloud Fraction (tenths) for Three Pass Lengths.

In view of this dependence, simple averaging of P(.1) across the
values of areal fraction will not produce the correct value of sampling
accuracy that can be expected on average when the particular sampling
mode is applied on a day-to-day basis or randomly in time. Instead, the
averaging must be weighted by the climatological frequency of areal cloud
fraction at the level being sampled, which, of course, varies with
location and season. If the climatological frequency happens also to be
U-shaped, which is a common situation, the expected sampling accuracy can
be dramatically better than an unweighted average. This is illustrated
in Figure 2.2, which also depicts the trade-off between sampling pass
length and accuracy.
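
The effect of climatological weighting can be illustrated with the 60-km column of Table 2.2. In the sketch below the accuracy values are taken from that table, but the U-shaped frequencies are hypothetical numbers chosen only to show the mechanism; they are not the Fulda or Ft. Rucker climatologies of Appendix B, and the function name is illustrative.

```python
# Mean P(.1) for 60-km passes, by areal cloud fraction NA = 0..10 tenths
# (60-km column of Table 2.2).
p_60km = [.98, .92, .54, .54, .45, .42, .42, .51, .60, .84, 1.00]

# HYPOTHETICAL U-shaped climatological frequencies of NA (sum to 1).
# These are illustrative only and do not come from the report.
u_shaped = [.35, .08, .04, .03, .03, .03, .03, .03, .04, .09, .25]


def weighted_accuracy(p_by_class, freq_by_class):
    """Average of P(.1) weighted by the frequency of each NA class."""
    total = sum(freq_by_class)
    return sum(p * f for p, f in zip(p_by_class, freq_by_class)) / total


unweighted = sum(p_60km) / len(p_60km)
weighted = weighted_accuracy(p_60km, u_shaped)
print(round(unweighted, 2), round(weighted, 2))
```

The unweighted value reproduces the .66 listed in the last row of Table 2.2. Because the hypothetical climatology concentrates its weight on the clear and overcast classes, where P(.1) is highest, the weighted value is considerably larger.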

The  climatological  frequencies  used  in  Figure  2.2  are  included  in 
Table  B-4  of  Appendix  B.  The  frequencies  for  Fulda  are  for  cloudiness 
in  the  altitude  range  0-3,000  feet,  for  January,  1200-1400  hours  local. 
These  data  were  derived  by  Lund  using  a  technique  devised  by  Gringorten.* 
Strictly speaking, the Ft. Rucker data used in Figure 2.2 are not
appropriate since they relate to total sky cover, not to cloudiness at a
particular  level.  They  were  used,  nevertheless,  in  order  to  demonstrate 
the  effect  of  a  climatology  that  is  not  so  strongly  U-shaped. 

We shall be using P(.1) throughout this report, but other
investigators have employed the standard error of estimate as their figure of
merit for sampling. No simple conversion exists between the two
measures, but in Section 4 a relationship between P(.1) and the standard
error of regression will be shown.

Figure 2.2 embodies the desired trade-off between path length and
uncertainty in the estimate of cloud fraction, but several underlying
features warrant emphasis:

A.  The simulated passes were located in almost all possible
positions within the 100x100 km cloud field. Consequently the values
of P(.1) represent the expected accuracy of a pass that is randomly
positioned in the target area. In the next section we consider patterns
that are deliberately positioned relative to the target and find some
that are more accurate than random passes of the same length.

B.  In order to accommodate the longer passes, it was necessary
to deal here with the entire 100x100 km cloud field, rather than a
quadrant, which is the size specified for the target area.

*  Gringorten, I. I., 1981: Climatic probabilities of the vertical
   distribution of cloud cover. AFGL-TN (in press).


Figure 2.2.  Average Sampling Accuracy, P(.1), as a Function of
             Pass Length.

             A.  Unweighted Average.

             B.  Average Weighted by Cloud Frequency at Ft. Rucker,
                 AL, in July.

             C.  Average Weighted by Cloud Frequency at Low
                 Altitude at Fulda, FRG, in January.

2.1.2  The Theoretical Answer.

If the airborne cloud sensor is sampled at the rate of 1 Hz, a
horizontal pass yields a set of point samples separated by less than 50
meters. Appendix B outlines how the binomial distribution can be used
to determine the accuracy with which areal cloud fraction can be
estimated from the cloud fraction observed on a set of points. The sole
obstacle to immediate application of this theory in our scenario is that
the theory calls for mutually independent samples whereas, due to the
spatial coherence of cloudiness, our closely spaced samples are not at
all likely to be statistically independent.

By-passing this complication for the moment, let us examine how
sampling accuracy, P(.1), depends on sample size. The data plotted in
Figure 2.3(A) for samples of 5, 10, 15, and 20 independent points were
derived according to the procedure of Appendix B. Just like the
experimental values of P(.1), the theoretical values are sensitive to areal
cloud fraction. Hence, Figure 2.3(A) displays averages of P(.1) weighted
by the same climatologies used in the preceding paragraph. Thus,
Figure 2.2 and Figure 2.3(A) are fully analogous and could be directly
compared were it not for the difference in abscissas: "sample length" in
one case, "number of points" in the other.
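
The dependence of the theoretical P(.1) on sample size can be sketched directly from the binomial distribution. The function below is a textbook evaluation written for illustration; it is not necessarily the procedure of Appendix B, and its name and the tolerance handling are assumptions. It returns the probability that the cloud fraction of n independent points lies within 0.1 of the areal fraction NA.

```python
from math import comb


def binomial_accuracy(n_points, na, tol=0.1):
    """Probability that the sample fraction k/n lies within tol of na,
    for n independent binary samples with cloud probability na."""
    total = 0.0
    for k in range(n_points + 1):
        # Small epsilon guards against floating-point error at the edges.
        if abs(k / n_points - na) <= tol + 1e-9:
            total += comb(n_points, k) * na**k * (1 - na)**(n_points - k)
    return total


# Accuracy for the four sample sizes of Figure 2.3(A) at NA = 0.5.
for n in (5, 10, 15, 20):
    print(n, round(binomial_accuracy(n, 0.5), 3))
```

For a clear or overcast area (NA of 0 or 1) every sample agrees with the areal value and the function returns 1, which is the theoretical counterpart of the high accuracies in the first and last rows of Table 2.2.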

To rectify this incompatibility we invoke the concept of
"independence length" introduced in Appendix C. This is the distance of
separation that is sufficient to ensure that cloud samples are statistically
independent. The average value of this length evaluated on our
development sample of data is 12.33 km. This value is used to convert the
sample size (number of points) in Figure 2.3(A) into an equivalent length
of sampling pass. In Figure 2.3(B) the results of this conversion are
plotted, together with the points of Figure 2.2.
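
The conversion between number of independent points and equivalent pass length can be written out as below. The sketch assumes simple proportionality (length equals number of points times the independence length), which is consistent with the statement in Section 4 that a 60-km pass has the predictive power of about 5 independent points; the exact rule used for Figure 2.3(B) is not given in this section, and the function names are illustrative.

```python
INDEPENDENCE_LENGTH_KM = 12.33   # average value quoted in the text


def points_to_length_km(n_points, indep_km=INDEPENDENCE_LENGTH_KM):
    """Equivalent pass length for n independent points (assumed proportional)."""
    return n_points * indep_km


def length_to_points(length_km, indep_km=INDEPENDENCE_LENGTH_KM):
    """Equivalent number of independent points in a pass of given length."""
    return length_km / indep_km


# Sample sizes plotted in Figure 2.3(A) and their equivalent pass lengths.
for n in (5, 10, 15, 20):
    print(n, round(points_to_length_km(n), 1))
```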

Figure 2.3(B) offers an extended view of the trade-off between
pass length and sampling accuracy and reveals a compatibility between
the experimental and theoretical values of P(.1).

2.1.3  Predictive Value of the Sample Autocorrelation.

According to preceding paragraphs, the accuracy of an inferred value
of areal cloud fraction depends on the equivalent number of independent
points in the linear sample, and this number depends on the "independence
length." It is reasonable to expect that the independence length and,
therefore, the intrinsic sampling accuracy vary in value from day to
day. Although the independence length is evaluated empirically in
Appendix C, it can be formulated theoretically, and the sample
autocorrelation function plays a key role in this formulation.

Figure 2.3.  Average Sampling Accuracy, P(.1), as a Function of
             Sample Size (A) or Pass Length (B).

             A.  Unweighted Average.

             B.  Average Weighted by Cloud Frequency at Ft. Rucker,
                 AL, in July.

             C.  Average Weighted by Cloud Frequency at Low
                 Altitude at Fulda, FRG, in January.

             •-•  From Theoretical Model.

             Δ    From Cloud Data (Figure 2.2).

This line of reasoning led to an experiment to test whether the
areal cloud fraction can be more accurately determined if both the 1-lag
autocorrelation coefficient of the sampling pass and its cloud fraction
are used as predictors.

Each of the 82 observed cloud fields was subdivided into 50x50 km
quadrants. For each row and each column of every quadrant, the cloud
fraction and the 1-lag autocorrelation, ρ1, were evaluated; also, the
areal cloud fraction for the quadrant. The same was done for the full
100x100 km area. Altogether, the result was more than .14,000 pairs of
line and areal cloud fractions, together with values of ρ1 for the line.
These pairs were stratified by value of ρ1 into 4 classes: less than
0.5, 0.5-0.9, greater than 0.9, and all values. For each class of ρ1
and for each value of linear cloud fraction, the distribution of areal
cloud fraction was determined, along with a variety of statistics.
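
For reference, the two quantities evaluated for each line can be sketched as below. The report does not state which estimator of the 1-lag autocorrelation was used, so the common sample estimator (lagged products of deviations divided by the sum of squared deviations) is assumed here; the function names and class labels are illustrative.

```python
import numpy as np


def lag1_autocorrelation(line):
    """Sample 1-lag autocorrelation of a binary cloud/no-cloud line.

    Assumed estimator: sum of lagged products of deviations from the
    line mean, divided by the sum of squared deviations.
    Returns NaN for an all-clear or all-cloudy line (zero variance).
    """
    x = np.asarray(line, dtype=float)
    d = x - x.mean()
    denom = float(np.sum(d * d))
    if denom == 0.0:
        return float("nan")
    return float(np.sum(d[:-1] * d[1:]) / denom)


def rho_class(rho):
    """Assign a value of rho1 to one of the three strata used in the text."""
    if np.isnan(rho):
        return "undefined"
    if rho < 0.5:
        return "rho1 < 0.5"
    if rho < 0.9:
        return "0.5 <= rho1 < 0.9"
    return "rho1 >= 0.9"


line = [0] * 10 + [1] * 10          # one cloud edge in a 20-pixel line
rho = lag1_autocorrelation(line)
print(float(np.mean(line)), rho, rho_class(rho))
```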

Figure 2.4 shows how the mean areal cloud fraction, NA, varies
with the linear fraction, NL, for the 4 classes of ρ1. Table 2.3 shows
how the average sampling accuracy, P(.1), varies among the classes. In
both instances, the variation with class is no more than might be
expected as a sampling fluctuation. We conclude, therefore, that the
predictability of NA from NL is negligibly enhanced by knowledge of ρ1.

2.2  Optimum Number of Levels to Sample.

The second problem raised by the Statement of Work was: "Determine
the optimum number of levels which the detector should traverse,
consistent with restrictions in vehicle range, in order to characterize areal
cloud coverage."

The crux of the matter here is to strike the best compromise
between accuracy of cloud amount at each level and assurance that no layer
goes undetected. The former is best served by maximizing the pass length
at each level, the latter by maximizing the number of levels sampled.

Figure 2.4.  Areal Cloud Fraction as a Function of Linear Cloud Fraction
             for Various Classes of the 1-Lag Autocorrelation of
             the Line Sample.


TABLE 2.3  MEAN SAMPLING ACCURACY, P(.1), AS A FUNCTION OF LINEAR
           CLOUD FRACTION, NL, FOR VARIOUS CLASSES OF THE 1-LAG
           AUTOCORRELATION, ρ1, OF THE LINE SAMPLE.

                                  P(.1)
   NL
(tenths)    ρ1 < 0.5    0.5 ≤ ρ1 < 0.9    ρ1 ≥ 0.9    All ρ1

    0         .66            .63             *          .76
    1         .66            .71            .60         .63
    2         .70            .66            .78         .74
    3         .65            .72            .78         .71
    4         .73            .67            .47         .06
    5         .69            .59            .65         .61
    6         .73            .60            .45         .61
    7         .69            .67            .71         .68
    8         .63            .65            .76         .65
    9         .57            .66            .54         .61
   10         .55            .66             *          .69

Average       .66            .66            .64         .67


In the absence of foreknowledge as to the types of clouds likely in
the target area, the best strategy is to fly passes no longer than 10 km
and to sample as many levels as possible, uniformly distributed within the
altitude range of prime operational interest. The reasons are the
following:

A.  As shown in Figure 2.2, sampling accuracy improves only
slowly with pass length, particularly in climates like that of Fulda.

B.  An inference of areal cloud coverage is damaged far more
by failure to detect a layer altogether than by a degraded estimate of
its amount.

C.  The frequency of cloud occurrence is typically a weak and
quasi-monotonic function of altitude.

An attempt could be made to quantify this solution, but we chose not
to because we doubt that there is a "good" answer for the case of
horizontal sampling and, more important, because the problem does not even exist
for a distinctly superior sampling mode that is treated in Sections 4 and 5.

2.3  Uncertainty of Tops and Bases.

The  third  specific  question  posed  by  the  Statement  of  Work  was: 

"Determine  the  uncertainty  in  the  estimation  of  tops  and  bases  given 
measurements  on  a  straight  line  trajectory  at  multiple  flight  levels." 

In the given circumstances, the uncertainty in estimate of the
top/base of a cloud layer is half the distance between sampling levels.
A cloud top/base would be declared to exist whenever cloud is detected
on one pass but not on the next higher/lower pass. To first
approximation, the median position for the boundary is the midpoint between
the two sampling levels.
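
The first-approximation rule can be sketched as a short routine. The function and variable names are illustrative, the example levels are hypothetical, and each pass is assumed to give a perfectly reliable cloud/no-cloud verdict for its level, which is the simplification that the refinement discussed next relaxes.

```python
def layer_boundaries(levels_km, cloudy):
    """Estimate cloud bases and tops from stepped horizontal passes.

    levels_km : sampling altitudes in ascending order
    cloudy    : parallel list, True if cloud was detected on that pass
    Returns a list of (kind, estimate_km, uncertainty_km), where the
    estimate is the midpoint between the two bracketing levels and the
    uncertainty is half the distance between them.
    """
    boundaries = []
    for i in range(len(levels_km) - 1):
        lower, upper = levels_km[i], levels_km[i + 1]
        if cloudy[i] != cloudy[i + 1]:
            kind = "top" if cloudy[i] else "base"
            boundaries.append((kind, (lower + upper) / 2, (upper - lower) / 2))
    return boundaries


# Hypothetical example: passes every 1 km, cloud detected at 2 and 3 km.
levels = [1.0, 2.0, 3.0, 4.0, 5.0]
hits = [False, True, True, False, False]
result = layer_boundaries(levels, hits)
print(result)
```

In this example the base is placed at 1.5 km and the top at 3.5 km, each with an uncertainty of 0.5 km.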

A  pedantic  refinement  of  this  estimate  would  take  account  of  the 
fact that whereas the existence of cloud at the one level is 100%
certain, it is less than dead certain that the other level is clear. The
degree  of  uncertainty  depends  on  the  length  of  sampling  pass  and  can  be 
estimated  by  means  described  in  Section  2.1.  This  would  lead  to  biasing 
the  median  estimate  of  the  boundary  altitude  toward  the  level  at  which 
no  cloud  was  detected. 

Again, it is fortunate that this problem evaporates in the
alternative sampling strategy recommended in Section 5.


3.  SAMPLING IN HORIZONTAL PATTERNS


The averages of sampling accuracy presented in Section 2.1.1 are
based on all possible locations of passes, which include, for the
100x100 km area, passes that are as remote as 50 km from the "target."
Consequently, as previously noted, the findings of that section
characterize the performance expected of sampling on horizontal passes
that are randomly positioned in the target area. It is reasonable to
expect that centrally located passes might be more representative of
the area. Also, for a fixed allocation of fuel to sample a level, it
might be more efficient to spend this on several short passes rather
than in a single long pass across the area. With such thoughts in
mind, we designed a series of experiments on the observed cloud fields
to test whether sampling in a prescribed horizontal pattern is more
efficient than the same distance of sampling in a pass that is randomly
located in the target area.

The patterns tested are shown in Figure 3.1. The results to be
quoted for configurations #1, 2, and 3 combine the row-patterns
illustrated here and the analogous column-patterns which are not shown. In
all cases the area sampled is 50x50 km. The sampling length is 100 km
for all patterns except for #1 whose length is 150 km. Except for the
closed pattern of #5, the total flight path would have to be longer
than the sampling path in order to link the passes. In configurations
#1 and #3 the sampling passes subdivide the area respectively into 4
and 3 equal parts. #2 is an example of "equal-area" sampling, which
will also play a key role in Section 4. To construct #2, the area was
first divided into halves; then a sampling pass was made through the
middle of each half. The square sampling pattern in #5 measures 25x25
km.
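
The difference between the "equal-area" layout of #2 and the dividing layouts of #1 and #3 can be made explicit by computing where the passes cross the 50-km side. The sketch below uses illustrative function names and assumes straight passes parallel to one side of the square, as in the row patterns of Figure 3.1.

```python
def equal_area_offsets(side_km, n_passes):
    """Pass positions for equal-area sampling: the area is cut into
    n_passes equal strips and one pass runs through the middle of each."""
    strip = side_km / n_passes
    return [strip * (i + 0.5) for i in range(n_passes)]


def dividing_offsets(side_km, n_passes):
    """Pass positions when the passes themselves divide the area into
    n_passes + 1 equal parts (configurations #1 and #3)."""
    return [side_km * (i + 1) / (n_passes + 1) for i in range(n_passes)]


side = 50.0
print("#1:", dividing_offsets(side, 3))     # 3 passes, 4 equal parts
print("#2:", equal_area_offsets(side, 2))   # 2 passes, equal-area
print("#3:", dividing_offsets(side, 2))     # 2 passes, 3 equal parts
```

With two passes, the equal-area rule places them at 12.5 and 37.5 km, whereas dividing the area into thirds places them nearer the center, at about 16.7 and 33.3 km. In both cases the sampling length is 2 x 50 = 100 km.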

For the test, the linear cloud fraction was measured for each of
the 8 configurations for all 4 quadrants of each of the 82 100x100 km
cloud fields. (There are 8 configurations, rather than 5, because the
row and column variants of numbers 1, 2, and 3 were treated separately
at this stage.) Thus, for each configuration there were 328 pairs of
linear and areal cloud fractions. These were stratified according to
linear fraction in tenths, and a frequency count was made for the areal
fraction. An example of the resulting matrix is shown in Table 3.1.

TABLE 3.1  FREQUENCY (%) OF AREAL CLOUD FRACTION, NA, AND
           SAMPLING ACCURACY, P(.1), AS A FUNCTION OF LINEAR
           CLOUD FRACTION, NL, FOR CONFIGURATION NO.
           N IS THE NUMBER OF CASES.

 NL \ NA     0    1    2    3    4    5    6    7    8    9   10      N    P(.1)

    0       65   28    7    0    0    0    0    0    0    0    0     43     .93
    1        5   47   34   11    3    0    0    0    0    0    0     38     .86
    2        0   11   40   40    9    0    0    0    0    0    0     35     .91
    3        0    3   18   41   38    0    0    0    0    0    0     34     .97
    4        0    0    0   17   38   38    7    0    0    0    0     29     .93
    5        0    0    0   16   26   32   21    5    0    0    0     19     .79
    6        0    0    0    0   12   21   39   24    3    0    0     33     .84
    7        0    0    0    0    3    6   29   55    6    0    0     31     .90
    8        0    0    0    0    0    0   13   33   46    8    0     24     .87
    9        0    0    0    0    0    0    5    5   35   50    5     20     .90
   10        0    0    0    0    0    0    0    0    9   50   41     22     .91

 AVERAGE                                                                    .892

(Because of rounding to integral values of percent, the frequencies do
not always sum to 100.)

From these distributions the sampling accuracy, P(.1), was evaluated
for the 11 values of linear fraction, and these were then averaged
(without weighting for climatological frequency). Finally, the row
and column results for configurations #1, 2, and 3 were averaged.
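
The evaluation of P(.1) from a frequency matrix can be checked against Table 3.1 itself. The sketch below uses the frequencies as read from that table; the function name is illustrative. For each value of NL it sums the frequencies of the NA classes within one tenth of NL, then averages the 11 results without weighting.

```python
# Frequencies (%) of NA = 0..10 tenths for each NL (tenths), from Table 3.1.
table_3_1 = {
    0:  [65, 28, 7, 0, 0, 0, 0, 0, 0, 0, 0],
    1:  [5, 47, 34, 11, 3, 0, 0, 0, 0, 0, 0],
    2:  [0, 11, 40, 40, 9, 0, 0, 0, 0, 0, 0],
    3:  [0, 3, 18, 41, 38, 0, 0, 0, 0, 0, 0],
    4:  [0, 0, 0, 17, 38, 38, 7, 0, 0, 0, 0],
    5:  [0, 0, 0, 16, 26, 32, 21, 5, 0, 0, 0],
    6:  [0, 0, 0, 0, 12, 21, 39, 24, 3, 0, 0],
    7:  [0, 0, 0, 0, 3, 6, 29, 55, 6, 0, 0],
    8:  [0, 0, 0, 0, 0, 0, 13, 33, 46, 8, 0],
    9:  [0, 0, 0, 0, 0, 0, 5, 5, 35, 50, 5],
    10: [0, 0, 0, 0, 0, 0, 0, 0, 9, 50, 41],
}


def row_accuracy(nl_tenths, freq_pct):
    """P(.1) for one row: share of cases with NA within one tenth of NL."""
    lo = max(nl_tenths - 1, 0)
    hi = min(nl_tenths + 1, 10)
    return sum(freq_pct[lo:hi + 1]) / 100


accuracies = [row_accuracy(nl, row) for nl, row in sorted(table_3_1.items())]
average = sum(accuracies) / len(accuracies)
print([round(p, 2) for p in accuracies], round(average, 3))
```

The row values agree with the P(.1) column of Table 3.1, and their unweighted mean agrees with the tabulated average of .892.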

The final results are displayed in Table 3.2. The best accuracy
is achieved by pattern #1, but it is 50% longer than the others. Among
the 100 km patterns, #2 is the best. As previously noted, it is the
equal-area pattern. In Section 4 it will be seen that equal-area
configurations are the most efficient for point sampling also. #5 is less
accurate than #2 or #3, but it is more economical of fuel.

TABLE 3.2  AVERAGE SAMPLING ACCURACY, P(.1), OF HORIZONTAL
           PATTERNS.

           (Configurations 1, 2, and 3 include passes along
           columns as well as the row patterns shown in
           Figure 3.1.)

CONFIGURATION #     LENGTH      P(.1)

       1            150 km      .972
       2            100         .960
       3            100         .9 8)
       4            100         .805
       5            100         .802



As speculated in the introduction to this section, all of the
patterns here are more effective for sampling than the random passes
treated in Section 2. The unweighted average of P(.1) for 100 km in Figure
2.2 is .751, but this is an unfair comparison because the area involved
there is 100x100 km, whereas here a 100-km linear sample is used to
specify the cloud fraction for a 50x50 km area. A more apt comparison,
albeit somewhat artificial, would be with the accuracy of a 200-km
random pass for the 100x100 km area. An estimate of this, from the
theoretical results shown in Figure 2.3, is .871.

The  principal  findings  of  this  section  are: 

A. When horizontal passes are used to specify the cloudiness
over a target area, it is more efficient to pattern and position the
passes than to sample along paths randomly located in the target area.

B. For a fixed total length of sampling path, an "equal-area"
pattern is more effective than the others examined here.


SECTION  4.  SAMPLING  IN  VERTICAL  PATTERNS 

The comparisons drawn in Section 2.1.2 highlight the fact that
horizontal sampling is intrinsically inefficient, owing to the strong
autocorrelation of cloudiness. For instance, sampling along 60 km
has the predictive power of only 5 independent point samples. The time
and fuel spent traversing the distance between points that are 12 km
apart are effectively wasted.
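
The arithmetic behind this example is simple enough to sketch. The
snippet below is only an illustration: it assumes the 12 km average
independence length cited from Appendix C, and the function name is
hypothetical rather than anything used in the study.

```python
def equivalent_independent_samples(path_km: float,
                                   independence_km: float = 12.0) -> float:
    """Approximate number of independent point samples in a continuous pass.

    Assumes one independent sample per independence length of path.
    """
    if independence_km <= 0:
        raise ValueError("independence length must be positive")
    return path_km / independence_km


# The 60 km pass quoted in the text.
n_equivalent = equivalent_independent_samples(60.0)
print(n_equivalent)
```

Under this assumption a 60 km pass is worth about five point samples,
which is the figure quoted above.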

This observation inspired a series of experiments to measure the
relative efficiency of point samples in regular patterns, such as would
result if the target volume were sampled in tight vertical spirals
connected by short horizontal segments at top or bottom, as illustrated
in Figure 1.1(C).

The 15 configurations tested are shown in Figure 4.1. For all,
the area is a 50x50 km quadrant of the basic cloud field; hence there
are 4 x 82 = 328 cases for each configuration. Corresponding to
configurations #1-5 there are half-scale counterparts (#6-10) which are
not illustrated. In these, the identical pattern is arrayed over the
co-centered square measuring 25x25 km. In all instances, including the
half-scale configurations, it is the cloud fraction of the 50x50 km area
that we are trying to specify from the cloud fraction of the point sample.

Patterns #12, 13 and 14 are "equal-area" in the sense introduced
in the preceding section. Each sample point is the center of one of the
N equal squares into which the quadrant is subdivided. As will be
seen, these are the most efficient of the configurations tested.
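
The equal-area construction can be sketched in a few lines. This is a
minimal illustration, assuming N is a perfect square (as for the 9- and
16-point patterns) and a 50 km quadrant with its origin at one corner;
the function name and coordinate convention are hypothetical.

```python
import math


def equal_area_points(n_points: int, side_km: float = 50.0):
    """Centers of the n equal squares into which a square area is divided."""
    k = math.isqrt(n_points)
    if k * k != n_points:
        raise ValueError("this sketch handles perfect-square point counts only")
    cell = side_km / k  # side of each equal square
    return [((i + 0.5) * cell, (j + 0.5) * cell)
            for j in range(k) for i in range(k)]


points_16 = equal_area_points(16)
print(len(points_16), points_16[0], points_16[1])
```

For 16 points the centers are 12.5 km apart, and for 9 points about
16.7 km apart.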

For each configuration, there are 328 pairs of point-sample cloud
fraction, NP, and areal cloud fraction, NA. These were stratified
according to the point fraction in tenths, and a frequency distribution
was constructed for the areal fraction. A typical result, that for
configuration #13, is shown in Table 4.1. From these distributions the
sampling accuracy, P(.1), was evaluated for the 11 values of point
fraction, and these were then averaged without weighting.
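
The evaluation of P(.1) from one stratum can be sketched as follows.
The sketch assumes that P(.1) is the percentage of cases in which NA
falls within one tenth of NP, a reading that reproduces the P(.1)
entries for the NP = 1 and NP = 2 rows of Table 4.1; the function name
is illustrative only.

```python
def sampling_accuracy(row_percent, np_tenths, tolerance_tenths=1):
    """Sum the percent frequencies of NA lying within the tolerance of NP.

    row_percent[i] is the percent frequency of NA = i tenths for this NP.
    """
    lo = max(0, np_tenths - tolerance_tenths)
    hi = min(len(row_percent) - 1, np_tenths + tolerance_tenths)
    return sum(row_percent[lo:hi + 1])


# Rows for NP = 1 and NP = 2 as read from Table 4.1 (NA = 0..10 tenths).
rows = {
    1: [23, 55, 20, 2, 0, 0, 0, 0, 0, 0, 0],
    2: [0, 20, 47, 27, 7, 0, 0, 0, 0, 0, 0],
}
accuracies = {np_: sampling_accuracy(r, np_) for np_, r in rows.items()}

# Unweighted average over the strata included here (the report uses all 11).
unweighted = sum(accuracies.values()) / len(accuracies)
print(accuracies, unweighted)
```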

In  addition,  for  each  configuration  NA  was  linearly  regressed  on 
NP,  using  the  328  pairs  of  values.  Subsequently  the  regression  equations 
were  tested  on  the  independent  sample  of  4  x  50  =  200  cases. 
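
A least-squares fit of this kind, with a check on independent data, can
be sketched as below. The data pairs are invented purely for
illustration and are not values from the study; the helper names are
hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rms_error(x, y, intercept, slope, fitted_params=0):
    """Root-mean-square residual; pass fitted_params=2 for dependent data."""
    ss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(ss / (len(x) - fitted_params))


# Hypothetical (NP, NA) pairs in tenths: a development and a test sample.
np_dev, na_dev = [0, 2, 5, 8, 10], [0.5, 2.5, 4.5, 7.5, 9.5]
np_test, na_test = [1, 4, 6, 9], [1.0, 4.5, 5.5, 8.5]

intercept, slope = fit_line(np_dev, na_dev)
se_dependent = rms_error(np_dev, na_dev, intercept, slope, fitted_params=2)
se_independent = rms_error(np_test, na_test, intercept, slope)
print(intercept, slope, se_dependent, se_independent)
```

Comparing the two errors shows whether the fitted line generalizes
beyond the sample used to derive it.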
TABLE 4.1 FREQUENCY DISTRIBUTION OF AREAL CLOUD FRACTION, NA, AS
A FUNCTION OF POINT CLOUD FRACTION, NP, FOR CONFIGURATION
NO. 13.

NP and NA are, respectively, point and area fractions (tenths).
N is number of cases; P(.1) is sampling accuracy (%).

NP\NA    0    1    2    3    4    5    6    7    8    9   10     N   P(.1)

  0     71   29    0    0    0    0    0    0    0    0    0    28    100
  1     23   55   20    2    0    0    0    0    0    0    0    44     98
  2      0   20   47   27    7    0    0    0    0    0    0    15     94
  3      0    0   37   47   10    6    0    0    0    0    0    51     94
  4      0    0    2   21   48   19   10    0    0    0    0    42     88
  5      0    0    0   10   35   25   25    5    0    0    0    20     85
  6      0    0    0    0   13   23   35   28    3    0    0    40     86
  7      0    0    0    0    0    5   27   41   27    0    0    22     95
  8      0    0    0    0    0    0    7   43   33   17    0    30     93
  9      0    0    0    0    0    0    4    4   21   57   14    28     92
 10      0    0    0    0    0    0    0    0    0   25   75     8    100

The results of these analyses are plotted in Figure 4.2 as a function
of the number of points in the sampling pattern. The sampling
accuracy, P(.1), is shown in the upper half of the figure, with the
points identified by configuration number. The 3 equal-area
configurations, which are connected by lines, are consistently the best
in their class.

In the lower half of Figure 4.2 the standard error of regression
is plotted. Again the 3 cases of equal-area patterns are superior.
The errors shown here are based on the dependent data, but as can be
seen in Table 4.2, there is insignificant difference between these and
the standard errors based on independent data. In fact, the two sets
of errors have identical averages, 1.0 tenths.

The  data  plotted  in  Figure  4.2,  plus  the  correlation  coefficients, 
are  tabulated  in  Table  4.2. 

Results thus far in this section are based on the realistic assumption
that NP is given and NA is to be specified. The 328 pairs of values
for each configuration were also analyzed assuming NA given and NP to be
predicted. This was done to facilitate a direct comparison, in terms of
sampling accuracy, between these configurations and the same number of
random points. The results are given in Table 4.3, where the columns
labeled H denote the unweighted average accuracy of random patterns as
derived from the binomial distribution. With only a few exceptions,
which are mainly for the half-scale patterns, the random patterns are
less efficient than all of the systematic patterns. This suggests that
none of the full scale patterns suffers from loss of efficiency due to
spatial autocorrelation. That circumstance was to be expected for all of
the full scale patterns except, possibly, #'s 5 and 14, where the points
are separated by only 10 km. In Appendix C, the average value of the
"independence length" is found to be 12 km.

[Figure 4.2. Sampling accuracy, P(.1) (%), and standard error of
regression (tenths) as functions of the number of points in the
sampling pattern.]

The values of P(.1) are different between Tables 4.2 and 4.3 because
of the difference in the underlying conditional frequencies: NP
being the predictor in 4.2, NA in 4.3. NP is seen to be the more
efficient predictor for the larger samples, NA for the smaller.
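
The random-point benchmark derived from the binomial distribution can
be sketched as follows. This is an illustration under stated
assumptions, not the authors' program: each random point is taken to
fall in cloud with probability NA, and the accuracy is taken to be the
probability that the sampled fraction lies within 0.1 of NA. The
function names are hypothetical.

```python
from math import comb


def random_point_accuracy(n_points, areal_fraction, tol=0.1):
    """Probability that the cloudy fraction of n random points is within tol of NA."""
    total = 0.0
    for k in range(n_points + 1):
        # Small epsilon guards against floating-point error at the tolerance edge.
        if abs(k / n_points - areal_fraction) <= tol + 1e-9:
            total += (comb(n_points, k)
                      * areal_fraction ** k
                      * (1.0 - areal_fraction) ** (n_points - k))
    return total


def unweighted_average_accuracy(n_points):
    """Average over NA = 0.0, 0.1, ..., 1.0 without climatological weighting."""
    fractions = [i / 10 for i in range(11)]
    return sum(random_point_accuracy(n_points, f) for f in fractions) / len(fractions)


for n in (9, 16):
    print(n, round(unweighted_average_accuracy(n), 3))
```

Because the systematic patterns generally do better than random points,
values from a calculation of this kind serve as a lower bound.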

As mentioned in Section 2, the standard error of estimate is sometimes
used as the index of sampling accuracy. Table 4.2 lists both
P(.1) and standard error of regression for the 15 configurations. Figure
4.3 reveals that the two measures are well related here, but there is
no assurance that this relation is applicable to other sampling patterns.

The  principal  findings  of  this  section  are: 

A. Point samples taken in well-distributed patterns are generally
more efficient than random point samples. Consequently, the
binomial distribution can be used as a conservative estimator of the
expected accuracy of systematic patterns of point samples.

B. Equal-area point patterns are the most efficient of the
configurations tested.
5. LOGISTICAL COMPARISONS AND RECOMMENDATIONS


5.1 Comparison of Horizontal and Vertical Sampling Patterns.

In the two preceding sections it has been shown that the so-called
equal-area configurations, whether lines or points, are the most
efficient samplers. It remains to consider the logistical efficiency of
lines and points; that is, to compare the fuel and time required to
execute the horizontal and vertical sampling patterns that are the means
of achieving the line and point configurations.

The performance specifications assumed for the sampling platform
are shown in Table 5.1. These were provided by Dr. Gerald Seemann,
president of Developmental Sciences, Inc., in response to a request for
generalized values that are compatible with capabilities of
state-of-the-art automatically piloted vehicles (APV's).

The  volume  to  be  sampled  is  50  km  across  and  from  1,000  to  10,000 
feet  in  the  vertical.  The  horizontal  samples  are  taken  at  intervals  of 
1,000  feet,  from  top  to  bottom,  for  a  total  of  10  levels. 

The horizontal patterns chosen for the "flyoff" are #'s 2 and 5 in
Figure 3.1, hereafter referred to as H-2 and H-5. Both yield a 100 km
sample at each level. H-2 is the efficient equal-area pattern, but it
has the drawback of requiring an extra leg of 25 km to close the pattern.
For present purposes we ignore any additional information that might be
gleaned from the bridging leg. H-5 is intrinsically less accurate than
H-2 but requires no extra leg.

The vertical patterns are #'s 12 and 13 in Figure 4.1, henceforth
V-12 and V-13. They are, respectively, the equal-area 9-point and
16-point configurations. As illustrated in Figure 1.1, they are
generated by flying alternate ascents and descents in tight spirals. A
standard-rate turn produces a radius of about 0.65 km for the spiral,
which is treated here as a vertical line. The spirals are linked at
top and bottom by horizontal legs: 17 km long for V-12, 13 km for V-13.
Again, we ignore the possibility of sampling on these legs.

The results of the flyoff are posted in Table 5.2, where the patterns
are listed in order of decreasing time and fuel consumption.
Because the horizontal patterns are flown from top to bottom, they
entail no ascents. The values of sampling accuracy are derived from
Tables 3.2 and 4.2, and are cited here only for intercomparison.


TABLE 5.1 PERFORMANCE POSTULATED FOR APV.

FLIGHT        AIRSPEED    VERTICAL RATE    FUEL RATE
MODE          (KTS)       (FT/MIN)         (LBS/HR)

HORIZONTAL    70          --               8
ASCENT        65          500              10
DESCENT       70          2000             5

TABLE 5.2 TIME AND FUEL REQUIRED FOR VARIOUS SAMPLING STRATEGIES.

                   H-2             H-5             V-13            V-12
PATTERN        TIME    FUEL    TIME    FUEL    TIME    FUEL    TIME    FUEL
SEGMENTS       (HRS)   (LBS)   (HRS)   (LBS)   (HRS)   (LBS)   (HRS)   (LBS)

HORIZONTAL     9.64   77.09    7.71   61.67    1.45   11.56    1.03    8.22
ASCENT          --      --      --      --     2.40   24.00    1.20   12.00
DESCENT        0.08    0.38    0.08    0.38    0.60    3.00    0.38    1.88
TOTAL          9.72   77.47    7.79   62.05    4.45   38.56    2.61   22.10

P(.1) (%)          96              89              93              81

As a class, the vertical patterns are the indisputable winners of
the flyoff. H-2 does afford the best sampling accuracy, but it is only
slightly more accurate than V-13 while it consumes more than twice the
fuel and time. Even H-5, which is less accurate than V-13, takes
three-quarters more time and uses 60% more fuel. The most economical of
the four patterns, V-12, requires less than 60% of the time or fuel
consumed by V-13, but at a sizeable penalty in accuracy. However, as
will be seen below, even V-12 can be expected to achieve 90% accuracy on
average in the real world.
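
The bookkeeping behind Table 5.2 can be sketched from the Table 5.1
rates. The pattern geometry below is an assumption made for
illustration: a 9,000-ft sampled depth, 125 km per level for H-2
(including the closing leg) and 100 km for H-5, 15 connecting legs of
12.5 km for V-13, 8 legs of 50/3 km for V-12, and 5 descents plus 4
ascents for the 9 profiles of V-12. These choices were made because
they reproduce the tabulated totals to within rounding; the names are
hypothetical.

```python
KM_PER_NAUTICAL_MILE = 1.852

# Rates from Table 5.1.
HORIZONTAL_KTS = 70.0
ASCENT_FPM = 500.0
DESCENT_FPM = 2000.0
FUEL_LB_PER_HR = {"horizontal": 8.0, "ascent": 10.0, "descent": 5.0}

DEPTH_FT = 9000.0  # assumed: 10,000 ft down to 1,000 ft


def pattern_cost(horizontal_km, ascent_ft, descent_ft):
    """Return (hours, pounds of fuel) for one complete sampling pattern."""
    hours = {
        "horizontal": horizontal_km / (HORIZONTAL_KTS * KM_PER_NAUTICAL_MILE),
        "ascent": ascent_ft / ASCENT_FPM / 60.0,
        "descent": descent_ft / DESCENT_FPM / 60.0,
    }
    fuel = sum(hours[mode] * FUEL_LB_PER_HR[mode] for mode in hours)
    return sum(hours.values()), fuel


# Assumed geometry: (horizontal km, total ascent ft, total descent ft).
PATTERNS = {
    "H-2": (10 * 125.0, 0.0, DEPTH_FT),
    "H-5": (10 * 100.0, 0.0, DEPTH_FT),
    "V-13": (15 * 12.5, 8 * DEPTH_FT, 8 * DEPTH_FT),
    "V-12": (8 * 50.0 / 3.0, 4 * DEPTH_FT, 5 * DEPTH_FT),
}

COSTS = {name: pattern_cost(*geometry) for name, geometry in PATTERNS.items()}
for name, (hours, fuel) in COSTS.items():
    print(f"{name}: {hours:.2f} hr, {fuel:.2f} lb")
```

Small differences from the table in the second decimal place are to be
expected, since the conversion factor for knots used in the original
calculation is not stated.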

5.2  Refined  Estimates  of  Sampling  Accuracy 


There are two reasons why the estimates of sampling accuracy in
Table 5.2 are pessimistic: A) they are unweighted averages, and B) they
are for cloud levels rather than layers.


Why an unweighted average of P(.1) is an improper and, usually,
pessimistic estimator of sampling accuracy is discussed in Section 2
and Appendix B. The correct estimator of the sampling accuracy
expectable in routine practice is a climatically weighted average. Figure
B-1 suggests that this estimator is strongly correlated with the degree
to which the frequency of cloud amount is U-shaped.

Climatically weighted averages of sampling accuracy for the 4 patterns
in the flyoff are listed in Table 5.3. Again, Ft. Rucker July
and Fulda 1-3 Ki-'T January are the two cloud climatologies used for the
illustration. Now, even the "cheap" pattern, V-12, averages about 90% in
accuracy.
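
The difference between the two kinds of average can be sketched as
below. Both the per-category accuracies and the U-shaped climatological
frequencies are invented for illustration; they are not the Ft. Rucker
or Fulda values.

```python
def weighted_accuracy(accuracy_by_tenth, frequency_by_tenth):
    """Average accuracy weighted by how often each cloud amount occurs."""
    total = sum(frequency_by_tenth)
    return sum(a * f for a, f in zip(accuracy_by_tenth, frequency_by_tenth)) / total


# Hypothetical P(.1) (%) for cloud amounts of 0..10 tenths.
accuracy = [100, 95, 90, 85, 80, 80, 80, 85, 90, 95, 100]

# Hypothetical U-shaped climatology (% of time at each cloud amount).
u_shaped = [30, 5, 3, 2, 2, 2, 2, 3, 5, 10, 36]

unweighted = weighted_accuracy(accuracy, [1] * 11)
weighted = weighted_accuracy(accuracy, u_shaped)
print(round(unweighted, 1), round(weighted, 1))
```

Because accuracy is highest near clear and overcast conditions, a
U-shaped climatology pulls the weighted average above the unweighted
one.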

TABLE 5.3 AVERAGE SAMPLING ACCURACY, P(.1) (%).

                                    WEIGHTED
SAMPLING PATTERN    UNWEIGHTED    FT. RUCKER    FULDA    AVERAGE

H-2                 96            98            99       98
H-5                 89            93            96       95
V-13                93            95            98       97
V-12                81            88            94       91

These values of accuracy are valid for sampling a level. In practice,
a single cloud layer will be sampled at more than one level.
This is particularly true for vertical patterns, in which the levels
are as close as 33 feet if the sensor is sampled at the rate of 1 Hz.
The estimator for the cloud fraction of the layer is the average of
the cloud fractions for all levels sampled within the layer.
Consequently, the layer cloud fraction is normally more accurate than the
level fraction. Just how much more accurate is a question that could
be readily answered if the level samples were independent, but they
are not, owing again to the spatial coherence of cloudiness. It is
beyond the scope of our data to appraise quantitatively the improvement
due to averaging of interdependent levels, but it can be taken
with confidence that the values in Table 5.3 are conservative
estimators for cloud layers of substantial thickness.
5.3 Other Advantages of Vertical Sampling Patterns.

Besides  the  distinct  superiority  in  fuel  economy,  vertical  sampling 
patterns  have  two  other  significant  advantages  over  horizontal:  greater 
accuracy  both  in  estimates  of  cloud  base/top  and  in  total  cloud  amount. 

In the present scenario, if the cloud sensor is sampled at 1 Hz,
cloud heights are determined from the vertical patterns to within 10
feet on ascent, whereas there is an uncertainty of ±500 feet in the
estimates of cloud base or top derived from horizontal sampling.
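
These height resolutions follow directly from the vertical rates in
Table 5.1. A minimal sketch, with a hypothetical function name:

```python
def height_step_ft(vertical_rate_fpm, sample_rate_hz=1.0):
    """Altitude change between successive sensor reports."""
    return vertical_rate_fpm / 60.0 / sample_rate_hz


ascent_step = height_step_ft(500.0)     # ascent rate from Table 5.1
descent_step = height_step_ft(2000.0)   # descent rate from Table 5.1

# Horizontal passes 1,000 ft apart bracket a base or top to half that spacing.
horizontal_uncertainty = 1000.0 / 2.0
print(round(ascent_step, 1), round(descent_step, 1), horizontal_uncertainty)
```

At 1 Hz the step is a little over 8 feet on ascent and about 33 feet on
descent, against 500 feet for level passes.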

From  vertical  samples,  the  total  cloud  amount  (within  the  volume 
sampled)  is  readily  estimated,  as  simply  the  fraction  of  profiles  on 
which  cloudiness  is  detected  at  any  height.  This  estimate  of  total 
cloudiness,  as  argued  above  with  respect  to  a  layer,  is  at  least  as 
accurate  as  the  estimate  of  cloud  amount  at  a  level,  and  probably  more 
so. 

By contrast, estimating the total cloud amount from horizontal
samples depends on answering two sometimes difficult questions: how
many cloud layers are present, and how do they overlap? For sampling
at 1,000-ft intervals it is reasonable, but not necessarily correct,
to assume that any cloudiness encountered on adjacent levels is part
of a single layer. Multiple layers are declared to exist only when
one or more intervening cloud-free levels are observed. As to the
overlap of multiple layers, one has little choice but to assume the
cloud elements in separate layers to be independent in their
positioning. Clearly, both steps in the process are potential sources of
significant error. For example, if 5/10 cloudiness is detected at both
of two levels, the estimate of total cloudiness will be either 5/10
or 8/10 depending on whether the two levels are assumed to be part of
a single layer or two layers are assumed to be present.
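
The two outcomes in this example can be sketched numerically. The
sketch assumes that a single layer takes the average of its level
fractions and that separate layers overlap independently; the function
names are hypothetical.

```python
def total_if_single_layer(level_fractions):
    """Levels belong to one layer: layer amount is the mean level fraction."""
    return sum(level_fractions) / len(level_fractions)


def total_if_independent_layers(layer_fractions):
    """Layers overlap at random: total = 1 - product of clear fractions."""
    clear = 1.0
    for fraction in layer_fractions:
        clear *= (1.0 - fraction)
    return 1.0 - clear


levels = [0.5, 0.5]
single = total_if_single_layer(levels)
independent = total_if_independent_layers(levels)
print(single, independent)
```

With 5/10 at both levels the independent-overlap estimate is 0.75,
which rounds to the 8/10 quoted above.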

5.4 Recommended Operational Procedure.

5.4.1  Choice  of  Sample  Size. 

The unequivocal finding of this section is that a vertical,
equal-area sampling pattern is superior. In practice, then, the only
choice open is how many points (profiles) there will be in the pattern.
This must be decided as a compromise among accuracy desired, volume
within which the cloud parameters are to be determined, and time/fuel
available for sampling.

Save for the possibility of using a faster APV, there are only two
means of reducing the time required to sample the cloudiness over a
target area: A) reduce the altitude range sampled; B) reduce the
number of profiles (points) sampled.

Sampling time does not decrease linearly with reduction in altitude
range because, even in vertical sampling, the horizontal legs
account for a non-trivial fraction of the overall time. In Table 5.2
the horizontal time is roughly 1/3 of the total for V-13 (1.45 of
4.45 hours) and nearly 40% for V-12 (1.03 of 2.61 hours).

Nor does the overall sampling time scale linearly with the number
of points in the pattern. While the time required for the vertical
components is proportional to the number of points, the time spent on
the horizontal component scales almost like the square root of the
number of points. The reason for this is that, as the number of points
decreases in an equal-area pattern, the separation of the points
increases.
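
The near square-root scaling can be sketched as follows, assuming the
profiles of an N-point equal-area pattern are visited in a serpentine
order so that each of the N - 1 connecting legs equals the grid
spacing. That ordering is an assumption for illustration; the function
name is hypothetical.

```python
import math


def horizontal_path_km(n_points, side_km=50.0):
    """Total length of the legs linking profiles in an equal-area grid."""
    k = math.isqrt(n_points)
    if k * k != n_points:
        raise ValueError("this sketch handles perfect-square point counts only")
    spacing = side_km / k
    return (n_points - 1) * spacing


for n in (4, 9, 16, 25):
    path = horizontal_path_km(n)
    print(n, round(path, 1), round(path / math.sqrt(n), 1))
```

The path length is side * (N - 1) / sqrt(N), which approaches
side * sqrt(N) as N grows; going from 9 to 16 points lengthens the
horizontal path by about 40% while the vertical work grows by nearly
80%.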

The penalty for reducing the number of profiles sampled is, of
course, reduced accuracy of the estimates of cloud amount. If the
sampling area is 50 km across, then the values of P(.1) cited in Table
5.3 are valid estimates for the areal cloud fraction at a level.
However, as noted above, these are conservative estimates for the accuracy
of the cloud fraction of a layer or for the accuracy of total cloud
amount.

Accuracies achievable with sample sizes other than 9 or 16 points
can be derived by interpolation/extrapolation from Table 5.3. For
this purpose Figure 2.3(A) is a useful guide even though it depicts
the dependence of P(.1) on sample size for random point samples. In
Section 4 it is shown that random points are a conservative estimator
of the accuracy for equal-area point patterns.

Likewise, the effect on P(.1) of the climatic frequency of cloud
amount can be judged from Figure B-1, subject to recognition that the
absolute values of P(.1) in Figure B-1 are for a 10-point random pattern.

5.4.2 Interpretation of Data.

Once taken and relayed to the ground, the data are readily
convertible into the 3 operational parameters desired: A) total cloud
amount in the target area, B) base and top of each cloud layer
present, and C) cloud amount in each layer.

The inference of total cloud amount (within the altitude range
sampled) is so simple that it could easily be derived by an onboard
counting circuit. The total cloud amount is nothing but the fraction
of profiles on which cloudiness was encountered at any level.

Recognizing  the  individual  cloud  layers  and  fixing  the  base  and 
top  of  each  is  probably  most  easily  done  subjectively  from  a  simple, 
side-by-side  plot  of  the  profiles. 

Once  the  base  and  top  of  a  layer  have  been  established,  the  cloud 
amount  for  the  layer  is  merely  the  average  number  of  cloudy  points 
among  all  profiles  and  within  the  altitude  range  of  the  layer.  This 
average  is  the  equivalent  of  estimating  the  cloud  fraction  for  all 
levels  sampled  within  the  layer  and  then  averaging  the  level  fractions. 
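
The two counting rules described in this subsection can be sketched as
below. The data layout (each profile as a list of height and in-cloud
pairs) and the sample values are hypothetical, chosen only to show the
bookkeeping; the layer base and top are taken as already identified.

```python
def total_cloud_amount(profiles):
    """Fraction of profiles on which cloud was encountered at any level."""
    cloudy = sum(1 for profile in profiles
                 if any(in_cloud for _, in_cloud in profile))
    return cloudy / len(profiles)


def layer_cloud_amount(profiles, base_ft, top_ft):
    """Fraction of sample points within the layer that were in cloud."""
    hits = count = 0
    for profile in profiles:
        for height_ft, in_cloud in profile:
            if base_ft <= height_ft <= top_ft:
                count += 1
                hits += 1 if in_cloud else 0
    return hits / count if count else 0.0


# Four hypothetical profiles, one report per 1,000 ft for brevity.
heights = [1000, 2000, 3000, 4000]
profiles = [
    list(zip(heights, [False, True, True, False])),
    list(zip(heights, [False, False, True, False])),
    list(zip(heights, [False, False, False, False])),
    list(zip(heights, [False, False, False, False])),
]

total = total_cloud_amount(profiles)
layer = layer_cloud_amount(profiles, base_ft=2000, top_ft=3000)
print(total, layer)
```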


6. CONVERSION FROM CLOUD FRACTION TO CLOUD COVER


Throughout the report to this point we have been concerned with
estimating projected cloud fraction whereas the contract Statement of
Work asks for estimates of cloud cover. The two measures of cloud
amount may differ because the projected fraction, or earth cover, is
not sensitive to cloud thickness while the sides of distant cloud
elements do contribute to sky cover, the fractional obscuration of
the sky when viewed from a point on the ground. Therefore, one would
expect: A) the two measures to be identical whenever the earth cover
is either 0 or 10 tenths, and B) sky cover to be somewhat the greater
for intermediate values of earth cover.

To explore this relationship quantitatively, we exploited an analysis
already performed on whole-sky photographs taken in conjunction
with standard sky cover observations. These observations were made
daily at 0900, 1200 and 1500 CST over a span of more than 3 years, at
the National Weather Service observing site in Columbia, MO. In the
original analysis, 2,805 photographs and matching observations of sky
cover were used to derive frequency distributions of cloud cover by
sky cover and by sector of the celestial dome.

Table 3 of the reference gives, as a function of sky cover, the
cumulative frequency of cloud-free fraction in a circle of 50° angular
diameter centered on the zenith. From this was derived Figure 6.1
showing the average sky cover, N, as a function of N(50), the cloud
cover of the 50° sector. Since the viewing angle of this sector
departs so slightly from vertical, the cloud cover for the sector should
approximately equal the cloud fraction, NA(50), of the sector.
Initially, it was our further expectation that, when averaged over enough
cases, the cloud fraction, NA, of the total sky and NA(50) should be
almost equal. However, Figure 6.1 clearly invalidates this hypothesis,
for it would mean that total cloud fraction exceeds total sky cover in
the upper range of cloud fraction. A scale phenomenon is responsible
for the fact that, even on average, NA(50) and NA are unequal. (Not
including the extreme points, N(50) = 0 and N(50) = 10, which account
for 75% of the cases, each point in Figure 6.1 is based on an average
of just under 80 cases.)

2. Lund, I. A., D. D. Grantham and R. E. Davis, 1980: Estimating
probabilities of cloud-free fields-of-view from the earth through
the atmosphere. Journal of Applied Meteorology, 19: 452-463.

Figure 6.1. Mean Sky Cover and Standard Deviation as a
Function of Cloud Cover of the Central 50°
Sector.

The existence of this scale phenomenon is most easily recognized
in the extreme cases. Consider the cases for which the narrow
overhead sector is clear. In at least some of these cases there must be
cloudiness elsewhere in the sky. Consequently the average of N for
these cases must be greater than 0. At the other extreme, the average
of N for all cases in which NA(50) = 1 must be less than 1. The
magnitude of the phenomenon probably depends on the size of the sector,
which is 10% of the total sky for the 50° sector.

To quantify this scale phenomenon, we return to our own cloud
data. Figure 6.2 shows how the cloud fraction, NA, of the 100x100 km
total area varies with NA(Q), the cloud fraction of the quadrant.
This plot, which is based on the 82 cases of the development sample
of cloud fields, confirms the expected shape of the relationship. The
line of regression is

NA = 1.18 + 0.735 NA(Q)    (6.1)

with a correlation coefficient of 0.99.

Despite the difference in sector size (25% for the quadrant vs.
10% for the 50° sector) we assume that Eq. (6.1) holds for the
dependence of NA on NA(50). This enables Figure 6.1 to be converted
into Figure 6.3 depicting sky cover as a function of earth cover,
just the relationship that we have been seeking. The ±1/10 confidence
limits for earth cover are also shown, and the sigma-bars in Figure
6.1 apply to the ordinates of the points in Figure 6.3.
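
The conversion implied by Eq. (6.1) can be sketched as below, with all
quantities in tenths. The function name is hypothetical, and applying
the quadrant regression to the 50° sector is the assumption stated in
the text rather than an independently verified relation.

```python
def total_fraction_tenths(sector_fraction_tenths):
    """Eq. (6.1): total-area cloud fraction from a sub-area fraction (tenths)."""
    return 1.18 + 0.735 * sector_fraction_tenths


# Value at which the regression line crosses the 45-degree line (NA = NA(Q)).
crossover = 1.18 / (1.0 - 0.735)

for sector in (0, 5, 10):
    print(sector, round(total_fraction_tenths(sector), 2))
print(round(crossover, 2))
```

A clear sector maps to about 1.2 tenths for the whole area and an
overcast sector to about 8.5 tenths, which is the scale phenomenon
described above.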

Figure 6.3 shows that sky cover tends to exceed earth cover by a
small amount, particularly for scattered cloudiness. However, the
difference is erratic and is comparable in magnitude to the noise level
of the data. Therefore, we feel that in practice there is no basis
here for drawing a distinction between sky cover and earth cover. In
other words, the projected cloud fraction for the area as derived from
the line or point samples discussed in earlier sections should be
taken as equal to the corresponding cloud cover.


Figure 6.2. Cloud Fraction of Total Sky (NA) as a Function of
Cloud Fraction of a Quadrant, NA(Q). Lines shown: line of
regression and 45° line.


Figure 6.3. Sky Cover as a Function of Earth Cover.
Lines Denote Confidence Limits of Earth Cover.


APPENDIX A


OBSERVED CLOUD FIELDS

McIDAS (Man-Computer Interactive Data Access System) is a
minicomputer-based system developed at the University of Wisconsin and
designed for the gathering and display of meteorological data. Among
its capabilities are the analysis and contouring of conventional
meteorological parameters (temperature, pressure, vorticity, streamlines,
etc.), the plotting of temperature and moisture soundings, and the
depiction of satellite imagery (visible and infrared). Plotting
routines are prompted via the terminal keyboard and displayed in color on
a CRT.

As an interactive system McIDAS is a very powerful tool that can
be put to many varied uses. It is especially well suited for the
investigation of clouds since both conventional data and satellite
pictures are available for scrutiny.

The cloud data of interest in this report were "half-mile"
resolution visible imagery. In particular the goal was to acquire a
representative cross section of samples of single layer clouds that
covered the spectrum from nearly clear to nearly overcast sky
conditions.

Data collection commenced on December 10, 1980 and continued
until May 18, 1981. Since some satellite images had been archived on
Betamax tapes, it was possible to obtain samples from as far back as
May 6, 1980. In all, 132 separate cases were selected.

The usual operating procedure was to examine a satellite photograph
of low resolution in the morning in order to identify areas where
single layer clouds were located. During the cold weather months care
had to be exercised to ensure that areas where snowcover was present
were not selected for sampling. This precaution was taken to prevent
the possibility of ambiguity between reflections from a snow surface
and those from a cloud top. Table A-1 shows the locations at which
the cloud samples were taken. A large proportion of the cases was
taken from the southeastern United States because of the snowcover
problem and the fact that, since other users of McIDAS were mainly
interested in the eastern half of the country, most satellite pictures
were centered there.


TABLE A-1. LOCATIONS OF SAMPLES.


Location                         Number of Samples

Florida                          25
Virginia                         9
North Carolina                   9
Alabama                          7
Georgia                          6
Pennsylvania                     5
Texas                            5
Louisiana                        4
Colorado                         4
Tennessee                        3
Mississippi                      3
South Carolina                   3
Ontario, Canada                  3
South Dakota                     2
Michigan                         2
Ohio                             2
New Mexico                       2
New Hampshire                    2
Minnesota                        2
Montana                          1
Alberta, Canada                  1
Idaho                            1
Kentucky                         1
West Virginia                    1
Northern Mexico                  1
Nevada                           1
New York                         1
Nebraska                         1
Maine                            1
Quebec, Canada                   1
Florida Keys                     1
Illinois                         1
New Jersey                       1
Massachusetts                    1
Arkansas                         2
Florida coastal waters           7
North Carolina coastal waters    4
Gulf of Mexico                   3
New Jersey coastal waters        1
South Carolina coastal waters    1
Lake Ontario                     1

TOTAL                            132

Once a general area was selected from the low resolution satellite
photograph, a higher resolution (nominal half-mile) image was
ingested into McIDAS. Since resolution is a function of distance from
the sub-satellite point, the actual resolution varied from .020-1. M<>
miles in the north-south direction to .471-.860 miles in the east-west
direction within the cases considered. From the high resolution image
an array of 235x135 pixels was picked and this constituted a single
sample.

Local noon was often chosen as sampling time so that the sun would
be high in the sky and ground-cloud contrast would be a maximum. Each
pixel in the satellite image has a brightness (watts cm^-2 steradian^-1)
associated with it. After adjustment for calibration each pixel was
labelled with a brightness count in the range 0-255. Since the purpose
of obtaining the satellite data was to produce a field that explicitly
depicted the location of cloud elements, a threshold brightness for
cloud designation had to be chosen. Any pixel with a brightness above
the threshold was designated as cloud. A value of 100 counts was most
often chosen as the brightness threshold between cloud and no cloud
although the range was from 50-100 counts. Threshold brightness
depended on time of day of sampling (sun angle) and type and thickness
of cloud (reflectivity). Printed output of x's for clouds and blanks
for no clouds was produced for each sample at the time of selection.
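
The thresholding step can be sketched as below. The brightness array is
invented for illustration, and the function names are hypothetical;
only the rule (counts above the threshold are cloud) and the x/blank
printout convention come from the text.

```python
def cloud_mask(brightness, threshold=100):
    """Mark each pixel as cloud when its count exceeds the threshold."""
    return [[count > threshold for count in row] for row in brightness]


def cloud_fraction(mask):
    """Fraction of pixels designated as cloud."""
    flat = [cell for row in mask for cell in row]
    return sum(flat) / len(flat)


def render(mask):
    """Printout in the style described: x for cloud, blank for clear."""
    return ["".join("x" if cell else " " for cell in row) for row in mask]


# Hypothetical calibrated brightness counts (0-255).
brightness = [
    [30, 120, 140, 60],
    [20, 110, 90, 40],
    [10, 50, 101, 100],
]

mask = cloud_mask(brightness)
for line in render(mask):
    print("|" + line + "|")
print(cloud_fraction(mask))
```

A pixel exactly at the threshold is treated as clear here, following
the wording "above the threshold".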

There is no assurance, of course, that the subjectively chosen
threshold value of brightness precisely defines the cloud boundary.
However, since our purposes require only realistic patterns of clouds
and spaces, it matters little whether the assumed cloud boundary in a
particular instance lies somewhat inside or outside the actual
boundary.

Since visual inspection of the satellite image, alone, is not
sufficient to guarantee that the clouds were restricted to a single layer,
several other sources of information were consulted. These included
Circuit "A" reports, synoptic charts and use of McIDAS to overlay
surface weather observations on the satellite image. Despite these
precautions there still can be no assurance that all the cloud samples were
restricted to a single layer.

Once a particular sample was selected, McIDAS wrote the brightness
distribution in the area of interest onto magnetic tape. Further
processing required use of a large mainframe computer (CDC 6600), for
which software was written for data conversion and unpacking. This
included a short program which calculated the distance between pixels and
then output the number of pixels necessary so that each sample area
would be approximately 100 km x 100 km.
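The pixel-count calculation mentioned above amounts to dividing the desired extent by the pixel spacing. The sketch below assumes a spacing given in statute miles; the function name and the half-mile example value are illustrative assumptions, not the original CDC 6600 program.

```python
KM_PER_MILE = 1.609344


def pixels_for_extent(extent_km, pixel_size_miles):
    """Number of pixels needed to span extent_km at the given pixel spacing."""
    return round(extent_km / (pixel_size_miles * KM_PER_MILE))


# At the nominal half-mile resolution, a 100-km side needs about 124 pixels.
n_pixels = pixels_for_extent(100, 0.5)
```

Because the resolution differs between the north-south and east-west directions, the same function would be called once per direction with the appropriate spacing.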

Upon examination of the entire cloud field (via the printouts) a
suitable sub-area was chosen such that the correct number of pixels
for the 100 km x 100 km area would be analyzed. The decision as to which
sub-area to choose was arrived at subjectively. Additional software was
developed for the data analysis.

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APPENDIX  B 


THEORETICAL SAMPLING MODEL

Visualize a cloud layer covering some fraction, NA, of the target
area. Suppose that some level within the cloud is sampled at a randomly
positioned point. The probability that this point is in cloud is NA,
the chance of its being in clear air is (1-NA).

Now suppose that the level is sampled at n random points. The
probability that all points are in cloud is (NA)^n, that all are in
clear air (1-NA)^n. The probability, P(N), that any intermediate number,
N, of the n points is in cloud is given by the binomial distribution:

$$P(N) = \frac{n!}{N!\,(n-N)!}\,(NA)^{N}\,(1-NA)^{n-N}. \qquad \text{(B-1)}$$

In Table B-1 is shown the binomial distribution of frequencies
(probabilities) corresponding to n = 10 and NA = .444. This distribution
is labeled F1. The distribution labeled F2, which was transferred
from Table 2.1, characterizes the cloud fraction of 60-km passes in one
of our observed cloud fields. NA = .444 for F2 also. The comparison
is shown merely for general interest. There is no reason to expect
close agreement between the two distributions.

TABLE B-1. FREQUENCY (F) OF CLOUD FRACTION (NP) OF A SAMPLE.

F1 is the binomial distribution for a set of 10 points.
F2, which was taken from Table 2.1, is for 60-km passes.
See text for P(.1).

NP (tenths)   0   1   2   3   4   5   6   7   8   9  10   P(.1)

F1 (%)        0   2   8  17  24  23  15   7   2   0   0    64
F2 (%)        1   3   7  11  19  23  18  11   5   2   0    51
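As a check on the F1 row, Eq. (B-1) can be evaluated directly for n = 10 and NA = .444. The sketch below is an illustration, not the authors' code; the helper name `binomial_pmf` is an assumption. Rounded to whole percent, the frequencies at NP = 3, 4, and 5 tenths are 17, 24, and 23, and P(.1) comes out near 64%.

```python
from math import comb


def binomial_pmf(n, na):
    """Eq. (B-1): probability that N of n random points fall in cloud, N = 0..n."""
    return [comb(n, k) * na**k * (1 - na) ** (n - k) for k in range(n + 1)]


n, na = 10, 0.444
pmf = binomial_pmf(n, na)
percent = [round(100 * p) for p in pmf]

# P(.1): sample fraction within one tenth of the areal fraction (nearest tenth).
nearest = round(na * n)
p_within = sum(p for k, p in enumerate(pmf) if abs(k - nearest) <= 1)
```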


As  a  safety  measure,  the  applicability  of  the  binomial  distribu¬ 
tion  to  our  cloud  data  was  directly  tested  and  confirmed.  Results  will 
be  illustrated  later  in  this  section. 

The 11x11 matrix that forms the main portion of Table B-2A is
merely the evaluation of Eq. (B-1) for n = 10 and for all integral
10ths of NA. Each row states the frequency distribution of sample
cloud fraction, NP, corresponding to the particular value of areal cloud
fraction, NA. For convenience, the principal diagonal has been lined in.

We define an index of sampling accuracy, denoted P(.1), as the
frequency of cases for which sample and areal cloud fraction agree to within
1/10, i.e., |NP - NA| <= 1 with both expressed in tenths. Thus, for each
row in the matrix, P(.1) is the sum of the frequency on the diagonal and
the two horizontally flanking values, except that there is but one
flanking value when NA = 0 or 1 (0 or 10 tenths). P(.1) is tabulated in
the final column of Table B-2A. It is symmetric about NA = 5 tenths,
where it is at minimum. This illustrates a general truth, namely, that
cloud sampling is commonly least accurate when the area is half clouded.
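The construction of the A-matrix and of P(.1) for each row can be sketched as follows. The function names are assumptions made for illustration; the logic is just Eq. (B-1) evaluated for NA = 0, .1, ..., 1 and the three-element sum described above.

```python
from math import comb


def a_matrix(n=10):
    """Rows: areal fraction NA in tenths (0..10); columns: sample count NP (0..n)."""
    rows = []
    for tenths in range(11):
        na = tenths / 10
        rows.append([comb(n, k) * na**k * (1 - na) ** (n - k) for k in range(n + 1)])
    return rows


def accuracy_by_row(matrix):
    """P(.1) per row: the diagonal element plus its horizontal neighbors.

    Assumes a square matrix (n = 10), so that column index equals NP in tenths.
    """
    size = len(matrix)
    return [
        sum(row[j] for j in range(max(0, i - 1), min(size, i + 2)))
        for i, row in enumerate(matrix)
    ]


A = a_matrix()
p1 = accuracy_by_row(A)
mean_p1 = sum(p1) / len(p1)  # unweighted average over the 11 rows
```

With these definitions the minimum of `p1` falls at NA = 5 tenths (about 66%), and the unweighted average is about 79.9%.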

Another  way  to  describe  the  basic  matrix  is  that  it  embodies  the 
conditional  frequency  of  sample  cloud  fraction,  given  the  areal  cloud 
fraction.  But  the  contract  Statement  of  Work  poses  the  converse  ques¬ 
tion:  given  the  sample  cloud  fraction,  what  is  the  frequency  distribu¬ 
tion  of  areal  fraction?  It  is  possible  to  transform  between  the  two 
formulations  with  the  aid  of  one  of  Bayes'  rules,  which  will  now  be 
derived. 

Consider the joint frequency of NA and NP, P(NA,NP), i.e., the
frequency of the simultaneous occurrence of specified values of areal
and sample cloud fractions. This can be evaluated as the product of
the conditional frequency of NP, given NA, and the unconditional
frequency of NA:

$$P(NA, NP) = P(NP \mid NA) \times P(NA). \qquad \text{(B-2)}$$

Conversely, the joint frequency of NP and NA can be expressed as

$$P(NP, NA) = P(NA \mid NP) \times P(NP). \qquad \text{(B-3)}$$

But, by definition, P(NA,NP) = P(NP,NA), and from Eqs. (B-2) and
(B-3) it follows that

$$P(NA \mid NP) = P(NP \mid NA) \times P(NA) / P(NP). \qquad \text{(B-4)}$$

The conditional frequency on the lefthand side of Eq. (B-4) is the
answer to the SOW's question. The first factor on the righthand
side is just the conditional frequency in Table B-2A. If we assume
that the sampling takes place randomly in time or on a day-to-day basis,
then the unconditional frequency of NA is nothing but the climatic
frequency, which is knowable. But what about the unconditional frequency
of sample fraction, P(NP)? This can be deduced by recognizing that a
sample fraction occurs only in conjunction with some area fraction. Thus,
the unconditional frequency of a particular value of NP is the sum of its
joint frequency with all possible values of NA. In other words,

$$P(NP) = \sum_{NA} P(NP \mid NA) \times P(NA) \qquad \text{(B-5)}$$

and, finally,

$$P(NA \mid NP) = \frac{P(NP \mid NA) \times P(NA)}{\sum_{NA} P(NP \mid NA) \times P(NA)}. \qquad \text{(B-6)}$$

TABLE B-2  CONDITIONAL FREQUENCY (%) OF CLOUD FRACTION (TENTHS),
BASED ON SAMPLE OF 10 RANDOM POINTS.
A. Given the Areal Fraction (NA).
B. Given the Sample Fraction (NP).
See text for other details.

TABLE B-3  SAME AS TABLE B-2B, EXCEPT FOR CLOUD CLIMATOLOGY AT
FT. RUCKER, AL, IN JULY.

Note that P(NA|NP) is dependent on the climatic frequency of the
areal fraction. This climatic frequency, P(NA), is tabulated in the
next-to-last column of Table B-2A. The particular distribution is for
cloudiness in the altitude range 0-3 kft at Fulda, FRG, in January,
1200-1400 LST. These values were derived by Lund using a model
developed by Gringorten.³

Table B-2B uses this P(NA) to construct from Eq. (B-6) a matrix
that is equivalent to the A-matrix except that now the sampled cloud
fraction, rather than the area fraction, is the predictor. Again, the
last column tallies sampling accuracy, P(.1), as a function of sample
cloud fraction.
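The passage from the A-matrix to the B-matrix through Eqs. (B-5) and (B-6) can be sketched as below. The prior used here is a made-up U-shaped distribution chosen only for illustration; it is not the Fulda climatology of Table B-2A, and the function names are assumptions.

```python
from math import comb


def likelihood(n=10):
    """P(NP | NA): rows are NA in tenths (0..10), columns are NP counts (0..n)."""
    return [
        [comb(n, k) * (t / 10) ** k * (1 - t / 10) ** (n - k) for k in range(n + 1)]
        for t in range(11)
    ]


def posterior(like, prior):
    """P(NA | NP) from Eq. (B-6): rows are NP, columns are NA."""
    n_na, n_np = len(like), len(like[0])
    post = []
    for j in range(n_np):
        joint = [like[i][j] * prior[i] for i in range(n_na)]
        total = sum(joint)  # P(NP), Eq. (B-5)
        post.append([x / total for x in joint])
    return post


# Hypothetical U-shaped climatic frequency P(NA); NOT the values used in the report.
prior = [0.30, 0.05, 0.04, 0.03, 0.03, 0.03, 0.03, 0.04, 0.05, 0.10, 0.30]
B = posterior(likelihood(), prior)
```

Each row of `B` sums to one, and with a strongly U-shaped prior a sample with no cloudy points points overwhelmingly to a clear area (`B[0][0]` is above 0.9 here).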

To show how this second matrix depends on climatology, Table B-3
was evaluated for the areal cloud frequency at Ft. Rucker, AL, in July.
This is the B-matrix only, for the A-matrix depends only on the number
of points in the sample and is, therefore, the same for the two cases.

The climatic frequencies used for Table B-3 are listed in Table B-4.

*  As previously noted, because of roundoff, the sum of frequencies
is not always precisely 100%.

3.  Gringorten, I. I., 19431: Climatic probabilities of the vertical
distribution of cloud cover. AFGL-TN (in press).


B-matrices for a 10-point random sample were also constructed for
the 8 additional cloud climatologies listed in Table B-4, but they
will not be reproduced in detail here. Instead, they will be
summarized below.

Although the Gringorten technique was not designed to treat layers
of zero-thickness, it can be pushed formally to this limit, which is
the "level" of our sampling problem. Consequently, evaluations for a
mid-level accompany each layer in Table B-4.

In Table B-4, "Mean" is the average cloudiness. "U" is an index
formed by summing the frequencies for NA = 0 and NA = 1. It is a crude
measure of the "U-shapedness" of the frequency distribution, and its
significance will become evident shortly. "Rank" is the order of the
tabulated frequency distributions in terms of the U-index, with 1
assigned to the lowest value of U. Note that, as expected, the
distribution for a level is consistently more U-shaped than that of the
embedding layer.


TABLE B-4  CLIMATOLOGICAL FREQUENCY (%) OF AREAL CLOUDINESS.

Ft. Rucker is from the Uniform Summary of Weather
Observations.

How P(.1) varies with cloud amount is interesting, but to facilitate
comparison of sampling strategies in terms of accuracy, it would be more
useful to have some sort of lumped value. However, a simple average of
P(.1) stands to be misleading. Because P(.1) is a significant function
of cloud fraction, a straight average would not correctly depict the
accuracy that could be expected from the particular strategy if it were
to be used randomly in time or on a day-to-day basis. For a proper
estimate of this expectation, the average of P(.1) must be weighted by
the climatological cloud frequency. This means, of course, that the
expected sampling accuracy varies with location, season, time of day,
altitude, etc., as well as with size of sample.
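A minimal sketch of the climatologically weighted average and of the U-index follows. The U-shaped frequencies are hypothetical values chosen for illustration, not entries from Table B-4, and the function names are assumptions. The per-row accuracy used here is the one with areal fraction as predictor.

```python
from math import comb


def row_accuracy(n=10):
    """P(.1) for each areal fraction NA = 0, .1, ..., 1 (binomial, n points)."""
    acc = []
    for t in range(11):
        na = t / 10
        pmf = [comb(n, k) * na**k * (1 - na) ** (n - k) for k in range(n + 1)]
        acc.append(sum(p for k, p in enumerate(pmf) if abs(k - t) <= 1))
    return acc


def u_index(freq):
    """U: summed frequency of the clear (NA = 0) and overcast (NA = 1) classes."""
    return freq[0] + freq[-1]


def weighted_accuracy(acc, freq):
    """Average of P(.1) weighted by the climatological frequency of NA."""
    return sum(a * f for a, f in zip(acc, freq))


acc = row_accuracy()
uniform = [1 / 11] * 11
# Hypothetical U-shaped climatology; NOT taken from Table B-4.
u_shaped = [0.30, 0.05, 0.04, 0.03, 0.03, 0.03, 0.03, 0.04, 0.05, 0.10, 0.30]

unweighted = sum(acc) / len(acc)
weighted_uniform = weighted_accuracy(acc, uniform)
weighted_u = weighted_accuracy(acc, u_shaped)
```

For a uniform climatology the weighted and unweighted averages coincide and U is 2/11, about 18%. The more U-shaped the climatology, the more weight falls on the clear and overcast classes, where sampling is most accurate, so the weighted average rises.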

In Table B-5, "U" is the shape index mentioned earlier, and several
averages of sampling accuracy are presented for each of the 10
climatologies in Table B-4. P'(.1) refers to the accuracy index when areal
fraction is the predictor. P(.1) is the same measure when the point-sample
fraction is the predictor. Without a superscript w, the average is
unweighted. Note that the unweighted average of P'(.1) is the same for
all 10 cases. This is as it should be, because P'(.1) depends only on
the number of points in the random sample, 10 here, and not at all on
the climatological cloud frequency. Note that P(.1), the accuracy index
when the sample fraction is used as a predictor, is slightly lower on
average than P'(.1). On the other hand, the weighted mean is invariably
and significantly larger than either of the unweighted values. It is
this weighted value that is the meaningful estimate of average accuracy
expected in practice.

Strictly speaking, the two distributions for Ft. Rucker are
inappropriate here because they refer to total sky cover. The other 8
distributions were custom-tailored to represent conditions in a layer or
at a level, which is what our scenario calls for. The justification
for introducing these "alien" data is that they extend the range of U
available for Figure B-1, which shows a clean relation between the
weighted mean of P(.1) and U. The lowest point corresponds to P'(.1),
which may be viewed as the weighted mean for a uniform distribution.
For such a distribution U = 18%.

In Table B-5 there is no climatologically weighted average for
P'(.1). The reason for this is that it is necessarily identical to
the weighted average of P(.1). In forming an average, the same matrix
elements are summed whether NA or NP is the predictor, namely all
elements for which |NA-NP| <= 1 (in tenths). Basically the values in
those positions differ between the A- and B-matrices, but when weighted
by the appropriate climatological frequency, what is summed is terms of
the form P(P|A) x P(A) or P(A|P) x P(P). In either case, the result is
P(A,P). In short, the climatologically weighted mean of our index of
sampling accuracy is simply the sum of the joint frequency of NA and NP
taken over a 3-element wide band along the principal diagonal.
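The identity stated above is easy to confirm numerically. The sketch below computes the weighted mean three ways: from the A-matrix weighted by P(NA), from the B-matrix weighted by P(NP), and directly as the band sum of the joint frequency. The prior is a hypothetical distribution used only for illustration.

```python
from math import comb

N = 10
# P(NP | NA): binomial likelihood, NA in tenths along rows.
like = [
    [comb(N, k) * (t / 10) ** k * (1 - t / 10) ** (N - k) for k in range(N + 1)]
    for t in range(11)
]
# Hypothetical climatic frequency P(NA); NOT taken from the report.
prior = [0.30, 0.05, 0.04, 0.03, 0.03, 0.03, 0.03, 0.04, 0.05, 0.10, 0.30]

# Joint frequency P(NA, NP) = P(NP | NA) x P(NA), Eq. (B-2).
joint = [[like[a][p] * prior[a] for p in range(11)] for a in range(11)]
p_np = [sum(joint[a][p] for a in range(11)) for p in range(11)]  # Eq. (B-5)


def in_band(a, p):
    """True on the 3-element wide band along the principal diagonal."""
    return abs(a - p) <= 1


# Route 1: rows of the A-matrix, weighted by P(NA).
from_a = sum(
    prior[a] * sum(like[a][p] for p in range(11) if in_band(a, p)) for a in range(11)
)
# Route 2: rows of the B-matrix, weighted by P(NP).
from_b = sum(
    p_np[p] * sum(joint[a][p] / p_np[p] for a in range(11) if in_band(a, p))
    for p in range(11)
)
# Route 3: band sum of the joint frequency.
from_joint = sum(joint[a][p] for a in range(11) for p in range(11) if in_band(a, p))
```

All three routes agree to within floating-point roundoff, whatever prior is supplied.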

TABLE B-5  AVERAGED SAMPLING ACCURACY (%) FOR EACH OF 10
CLIMATOLOGICAL CLOUD FREQUENCIES.

Based on binomial distribution for n = 10.
See text for significance of the several values of P(.1).

STATION                       U    P'(.1)    P(.1)    P(.1)w

Ft. Rucker       Jul         37     79.9      79.4     84.8
    "            Oct         63               81.0     91.6
Fulda            0-3 kft     71               80.2     92.9
    "            2 kft       89               80.6     97.5
    "            6-10 kft    70               80.5     92.8
    "            8 kft       83               80.0     96.2
Adana/Incirlik   0-3 kft     87               71.7     97.3
    "            2 kft       94               65.1     98.7
    "            6-10 kft    68               79.5     93.4
    "            8 kft       72               79.5     94.3

Mean                                79.9      77.7     93.9
RMS Δ                                    5.39      17.9
Mean |Δ|                                 2.71      16.2


As noted earlier, an acid test was conducted to verify that the
mean of random point samples from one of our cloud fields is distributed
binomially. The test was conducted separately for N = 5, 10, 20, and 30.
The procedure will be illustrated here for the case of N = 10.

Ten points are randomly positioned in one of the 100x100 km cloud
fields. NOP, the number of points falling within cloud, is counted.
This constitutes 1 trial. The trial is repeated for a total of 100, and
the frequency distribution of NOP is evaluated. This is the experimental
distribution for the particular case. To match it, the binomial
distribution is derived for the observed value of NA, the areal cloud fraction,
and for N = 10.

All 82 cases in the development sample of cloud fields are processed
in this manner. The resulting 82 distributions of NOP are grouped by
value of NA (rounded to tenths) and averaged. The upshot is an 11x11
matrix of the observed conditional frequency of NOP, given NA. (In the
general case this matrix is 11 x (N+1).) A matrix of theoretical
frequencies is generated by performing the same operations on the binomial
distributions for the 82 cases. The sampling accuracy, P(.1), is
evaluated for each row in both matrices.
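The trial procedure for a single cloud field can be sketched as follows. The cloud field here is a synthetic stand-in (a few circular "clouds" on a 100x100 grid), not one of the 82 satellite cases, and the function names are assumptions. Points are drawn independently and uniformly over the grid.

```python
import random
from math import comb


def make_field(size=100):
    """Hypothetical cloud field: 1 inside a few circular clouds, 0 elsewhere."""
    clouds = [(20, 30, 15), (60, 70, 20), (80, 20, 10)]  # (x, y, radius)
    return [
        [
            1 if any((x - cx) ** 2 + (y - cy) ** 2 <= r * r for cx, cy, r in clouds) else 0
            for x in range(size)
        ]
        for y in range(size)
    ]


def sample_counts(field, n_points, n_trials, rng):
    """Relative frequency of NOP (points in cloud) over repeated random trials."""
    size = len(field)
    hist = [0] * (n_points + 1)
    for _ in range(n_trials):
        nop = sum(
            field[rng.randrange(size)][rng.randrange(size)] for _ in range(n_points)
        )
        hist[nop] += 1
    return [h / n_trials for h in hist]


field = make_field()
na = sum(map(sum, field)) / (len(field) * len(field[0]))  # areal cloud fraction
observed = sample_counts(field, n_points=10, n_trials=100, rng=random.Random(0))
theory = [comb(10, k) * na**k * (1 - na) ** (10 - k) for k in range(11)]
```

With only 100 trials the observed frequencies scatter by a few percent around the binomial values; raising `n_trials` tightens the agreement.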

The differences between corresponding elements of the two matrices
are shown in Table B-6. The close agreement is immediately evident,
considering that the row-sum for both matrices is 100 (within roundoff).
For more detailed inspection, the theoretical matrix here is almost
identical to that shown in Table B-2A, the difference being that the
rows here are averages of distributions based on generally non-integral
values of NA.

The last column in Table B-6 shows that, not surprisingly, the
agreement in P(.1) between observed and theoretical distributions is also
very good.

TABLE B-6  DIFFERENCE BETWEEN FREQUENCIES OF OBSERVED RANDOM
SAMPLES AND BINOMIAL DISTRIBUTION FOR N = 10.

NA and NP are in tenths; all other values are in %.

NA \ NP    0   1   2   3   4   5   6   7   8   9  10   P(.1)

   0       1   0   0   0   0   0   0   0   0   0   0     1
   1       0  -1   0   0   0   0   0   0   0   0   0    -1
   2       1  -2   2  -1   1   1   0   0   0   0   0    -1
   3      -1   1   3  -1  -1   0  -1   0   0   0   0     1
   4      -1  -2  -1   0   1   0   0   0   1   0   0     1
   5       0   0   0   1  -2   0   0   0   1   0   0    -2
   6       0   0   0   0   0   1   0   1   0   0   0     1
   7       0   0   0   0   0  -1   0  -3   1   1   0    -2
   8       0   0   0   0   0   1   1  -3  -2   2  -1    -3
   9       0   0   0   0   0   0   0  -1   2  -1   1     2
  10       0   0   0   0   0   0   0   0   0   1  -1     0

Similarly, we found excellent agreement between observed and
theoretical frequencies for the other values of N tested: 5, 20, 30. Indeed,
the agreement was even better for the larger values of N. Consequently,
we conclude without reservation that random point samples taken within
cloud fields do follow the binomial distribution.
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APPENDIX  C 


ESTIMATING  THE  INDEPENDENCE  FRACTION 


If  the  cloud  sensor  is  sampled  at  1  Hz,  a  pass  across  the  target 
area  produces  more  than  a  thousand  points  at  intervals  of  less  than 
50  meters.  How  good  a  predictor  of  the  areal  cloud  cover  is  the  cloud 
fraction  of  these  points?  As  outlined  in  Appendix  B,  this  question 
could  be  readily  answered  in  terms  of  the  binomial  distribution  if  the 
points  were  independent,  but  samples  so  closely  spaced  are  highly 
correlated.  This  interdependence  inflates  the  variance  of  the  point 
fraction  and  reduces  the  accuracy  of  the  pass  mean  cloudiness  as  an 
index  of  area  cloud  cover.  In  terms  of  this  variance,  the  pass  behaves 
as though composed of some smaller number of independent points. Using
the procedure described below, for each of the 82 basic cloud fields
we evaluated this reduced number, N', and the "Independence Fraction,"
IF, defined as N'/N where N is the actual number of points in the
sample.

For each case the variance, $\sigma_N^2$, of the half-row cloud fraction
was directly evaluated for the 100x100 km area. On the average, these
fractions were based on 62 points separated by .82 km, and $\sigma_N^2$
was based on 166 values.

Since the point value is either 0 or 1, the mean and mean square
of any collective of points are identical. Consequently, considering
the 10,000-plus points in the 100x100 km cloud field as the entire
population, their variance is NA(1-NA) where NA is the cloud fraction
for the whole area which is, of course, the population mean. It is
known that the variance, $\sigma_{N'}^2$, of the mean of any subset
consisting of N' independent samples is 1/N' times the point variance.
We now ask in each case: what value of N' yields the same variance as
that observed in the half-rows? This generates the following formulation
for the Independence Fraction:

$$IF = N'/N = \frac{NA\,(1-NA)}{N \times \sigma_N^2}. \qquad \text{(C-1)}$$



Using the half-rows and Eq. (C-1), we evaluated IF for each of
the 82 cases. The average value of IF was 0.090 with a standard
deviation of 0.054. In other words, in terms of their variance, the
half-row cloud fractions behaved as though based on fewer than 6 points
(.090 x 62 = 5.6).

An analogous concept is that of "Independence Unit of Length," IU,
which is the average separation of the hypothetical N' independent
samples. Since the half-rows are 50 km long in all of our cases, IU =
50/N' km. For all 82 cases the average of IU is 12.33 km with a
standard deviation of 6.7 km.
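Eq. (C-1) and the Independence Unit of Length can be sketched as below. The input is a list of equal-length half-rows of 0/1 values. The function names are illustrative assumptions, and the population form of the variance (dividing by the number of half-rows) is also an assumption, since the text does not say which form was used.

```python
import math


def independence_fraction(half_rows):
    """IF = NA(1-NA) / (N * var), with var the variance of half-row cloud fractions."""
    n = len(half_rows[0])  # points per half-row, N
    means = [sum(row) / n for row in half_rows]
    na = sum(means) / len(means)  # areal cloud fraction (equal-length rows)
    var = sum((m - na) ** 2 for m in means) / len(means)
    if var == 0:
        return math.inf  # every half-row has the same fraction
    return na * (1 - na) / (n * var)


def independence_unit(length_km, n_points, fraction):
    """IU = length / N', where N' = IF x N is the equivalent independent count."""
    return length_km / (fraction * n_points)


# Two 4-point half-rows, one overcast and one clear: each behaves like one point.
rows = [[1, 1, 1, 1], [0, 0, 0, 0]]
frac = independence_fraction(rows)
iu = independence_unit(50, 4, frac)
```

In this extreme example IF = 0.25, so N' = 1 and the whole 50-km half-row acts as a single independent sample.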

Normally, the procedure above cannot be used in practice for want
of a sufficient number of passes from which to compute the variance of
the pass mean. Hence we derived and tested several formulations for
approximating IF from single-pass data. All of these entailed
evaluating autocorrelation functions, and all resulted in values of IF that
were, on average, larger than the directly derived values discussed
above.

Since several of these approximations of IF assume that the
sequence of samples forms a Markov chain, we tested this assumption by
evaluating the "Markov multiplier" for lags 2-6. The data used were
autocorrelations for lags 1-6, based on entire rows, which average 124
points in length. For each lag the autocorrelation coefficient was
averaged for the approximately 83 rows in each case. The Markov
multiplier is defined as $M_\ell = \rho_\ell / \rho_1^{\ell}$ where $\ell$
is the lag. These multipliers were then averaged over the 82 cases. The
results are plotted in Figure C-1. For a Markov sequence, $M_\ell = 1$
for all values of $\ell$. Figure C-1 shows that in our cloud samples, the
autocorrelation function decays more slowly than Markovian.
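The Markov multiplier can be computed from a single series as sketched below. To have something to test it on, the sketch generates a synthetic two-state Markov series with a chosen mean and persistence; that generator, the parameter values, and the function names are assumptions for illustration and are not the satellite data.

```python
import random


def autocorr(series, lag):
    """Sample autocorrelation at a lag, using the whole-series mean and variance."""
    n = len(series)
    mean = sum(series) / n
    var = sum((y - mean) ** 2 for y in series) / n
    cov = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    ) / (n - lag)
    return cov / var


def markov_multipliers(series, lags=(2, 3, 4, 5, 6)):
    """M_l = rho_l / rho_1**l; equal to 1 at every lag for a Markov sequence."""
    rho1 = autocorr(series, 1)
    return {lag: autocorr(series, lag) / rho1**lag for lag in lags}


def persistent_series(n, na, alpha, rng):
    """Hypothetical two-state Markov series with mean na and persistence alpha."""
    x = 1 if rng.random() < na else 0
    out = [x]
    for _ in range(n - 1):
        p_one = alpha * x + (1 - alpha) * na  # chance that the next value is 1
        x = 1 if rng.random() < p_one else 0
        out.append(x)
    return out


series = persistent_series(50_000, na=0.4, alpha=0.7, rng=random.Random(0))
multipliers = markov_multipliers(series)
```

For this synthetic Markov series the multipliers stay near 1. Values rising above 1 with increasing lag would indicate an autocorrelation function that decays more slowly than Markovian.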
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SOME  REFLECTIONS  ON  HOMOGENEITY 

The contract Statement of Work invited us, first, to work the
overall problem assuming horizontal homogeneity of the cloud field and, then,
to consider and evaluate the effects of non-homogeneity. As noted in
the overview, we found it from the outset unnecessary to limit our
purview  to  homogeneous  cloud  fields.  The  theoretical  approach  made  no 
stipulation  about  the  cloud  field,  only  that  the  sampling  points  be  ran¬ 
domly  deployed.  The  experimental  approach  was  based  on  cloud  fields  as 
nature  served  them  up.  Nevertheless,  we  have  given  some  thought  to  the 
pleasures  of  sampling  a  homogeneous  field. 

What  is  homogeneity?  The  Statement  of  Work  gives  no  clue.  As  a 
realization  of  a  horizontally  homogeneous  cloud  field,  one  might  well 
visualize  a  field  of  fair-weather  cumulus,  of  size  and  spacing  that  are 
variable  but  not  too  much  so,  and  without  any  mesoscale  structure.  Any 
sampling  pass  made  through  such  a  field  —  provided  that  it  is  sufficiently 
long  relative  to  the  "scale  size"  of  the  field  —  should,  in  a  statistical 
sense,  be  equivalent  to  any  other  pass.  The  most  primitive  statistic  is 
the  mean  cloudiness  along  the  path.  Thus,  we  are  led  to  the  definition 
that  a  cloud  field  is  homogeneous  if  and  only  if  all  sampling  passes  of 
sufficient  length  yield  the  same  value  of  cloud  fraction.  Under  such  con¬ 
ditions,  the  sampling  problem  is  trivial.  A  single  horizontal  pass  of 
sufficient  length  yields  a  flawless  estimate  of  cloud  fraction  for  the 
area. 

While all horizontally homogeneous cloud fields are, thus, equally
easy to sample, it does not follow that all inhomogeneous fields are equally
difficult to sample. There is one, not-uncommon, situation that is
particularly troublesome. The field we describe is the one we call the "cloud
front" field, which is shown in Figure D-1. Given an area of interest,
which we take to be a square, we break it into exactly two pieces, one
overcast and one clear. For further simplification, we assume that each
of the two pieces is a rectangle, the "cloud front" being the line between
them.

Figure D-1.  Geometry of the Cloud Front Sample. (Labels in the
figure: Cloud Front, Length.)


We now perform straight-line sampling of this cloud field. We shall
assume that:

(1)  the square is of length 2;

(2)  the sample line of fixed length L (L <= 2) is centered in
the square, and takes no angular preference;

(3)  without loss of generality, the areal cloud cover NA is
>= 1/2.

Let the fraction of our straight-line sample that is in cloud be NL.
Some consideration shows that in all cases NA <= NL! (If NA < 1/2 then
NA >= NL always.) The question to be answered is this: with all areal
cloud covers NA being equally likely, find the probability P(.1) that NL
is within .1 of NA.

A short calculation shows that

$$\Pr\left(|NA - NL| \le .1 \mid NA\right) = \frac{2}{\pi}\cos^{-1}\left(\min\left\{1,\ \frac{2\,NA - 1}{L\,(NA - .4)}\right\}\right). \qquad \text{(D-1)}$$

We can thus write

$$P(.1) = \int \Pr\left(|NA - NL| \le .1 \mid NA\right)\, d(NA),$$

and

$$P(.1) = \frac{4}{\pi}\int_{1/2}^{1} \cos^{-1}\left(\min\left\{1,\ \frac{2x - 1}{L\,(x - .4)}\right\}\right) dx. \qquad \text{(D-2)}$$

We expect that as L increases from 0 to 2, P(.1) increases as well.
We have evaluated Eq. (D-2) for a succession of path lengths, obtaining
the graph shown in Figure D-2.

The "cloud front" field is probably the most extreme example of an
inhomogeneous field; yet it certainly occurs with no small probability.
If such a field is considered likely on a given day, Figure D-2 shows
that even with a straight line pass equal in length to the scale of the
region in question (L = 2), the probability of obtaining an accurate
sample is just over 10%. We note finally that if instead of sampling
along a straight line an equal-area point sample is taken, the
resulting P(.1) is larger and is equal to 60% for the 9-point sample, 80% for
the 16-point sample.
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A  NOVEL  METHOD  FOR  EVALUATING  THE  ONE-LAG  AUTOCORRELATION 

In view of the simple, dichotomous nature of cloud observations
it is possible to derive the one-lag autocorrelation in an extremely
simple form; namely,

$$\rho_1 = 1 - \frac{g}{n\,NL\,(1 - NL)} \qquad \text{(E-1)}$$

where g is the number of separate cloud groups in the series of
observations of zeros (clear) and ones (cloud). Therefore g is the number
of discrete strings of ones. The remaining terms have their usual
meaning: n is the total number of observations in the series and NL
is the fraction which are ones.
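Eq. (E-1) can be compared with a direct evaluation of the one-lag autocorrelation in a few lines. The function names are assumptions made for illustration. The direct version divides the sum of lagged products by n and uses the whole-series mean and variance, in which case the two expressions agree exactly.

```python
def count_groups(series):
    """g: number of separate strings of ones (cloud groups) in a 0/1 series."""
    return sum(
        1 for i, y in enumerate(series) if y == 1 and (i == 0 or series[i - 1] == 0)
    )


def rho1_from_groups(series):
    """One-lag autocorrelation from Eq. (E-1)."""
    n = len(series)
    nl = sum(series) / n
    return 1 - count_groups(series) / (n * nl * (1 - nl))


def rho1_direct(series):
    """One-lag autocorrelation from lagged products, whole-series mean and variance."""
    n = len(series)
    nl = sum(series) / n
    mean_product = sum(series[i] * series[i + 1] for i in range(n - 1)) / n
    return (mean_product - nl * nl) / (nl * (1 - nl))


obs = [0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0]  # 12 observations, 6 ones, 3 groups
block = [1] * 5 + [0] * 5  # one cloud group covering half the pass
```

For `obs`, n x NL x (1 - NL) = 3 and g = 3, so the one-lag autocorrelation is zero. For `block`, a single long cloud group gives 1 - 1/2.5 = 0.6.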

Eq. (E-1) can be derived by means of two separate approaches.
One makes use of probability concepts and the other consists essentially
of direct substitution into the defining expression for the one-lag
autocorrelation. Both approaches will be illustrated since they each
offer somewhat different insights into the nature of Eq. (E-1).

We  have  a  series  of  uniformly  spaced  observations  along  a  straight, 
horizontal  line  whose  elements  consist  of  zero  (clear)  or  one  (cloud). 

Let P(0|1) be the conditional probability of a zero given a one
as the antecedent observation. Let P(1|1) be the conditional
probability of a one given a one as the antecedent observation. Then it is
apparent that

$$P(1 \mid 1) = 1 - P(0 \mid 1). \qquad \text{(E-2)}$$

P(1|1) times the number of ones in the series is merely the sum
of the one-lagged products. That is,

$$n\,NL\,P(1 \mid 1) = \sum_i y_i\,y_{i+1} \qquad \text{(E-3)}$$

where $y_i$ is the i-th observation. Thus,

$$NL\,P(1 \mid 1) = \overline{y_i\,y_{i+1}} \qquad \text{(E-4)}$$

where the overbar indicates an average over the domain.

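From the definition of the one-lag autocorrelation, $\rho_1$,

$$\rho_1 = \frac{\overline{y_i\,y_{i+1}} - \bar{y}_i\,\bar{y}_{i+1}}{\sigma_{y_i}\,\sigma_{y_{i+1}}}, \qquad \text{(E-5)}$$

where $\sigma_{y_i}^2 = \sigma_{y_{i+1}}^2 = \sigma_y^2$ is the variance of
the set of observations, we obtain

$$\overline{y_i\,y_{i+1}} = \rho_1\,\sigma_y^2 + \bar{y}^2 = \rho_1\left(\overline{y^2} - \bar{y}^2\right) + \bar{y}^2 = \rho_1\left(NL - [NL]^2\right) + [NL]^2.$$
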

Therefore, from Eq. (E-4)

$$P(1 \mid 1) = (1 - NL)\,\rho_1 + NL = \rho_1 + NL\,(1 - \rho_1). \qquad \text{(E-6)}$$

However, P(0|1) may be expressed as

$$P(0 \mid 1) = \sum_x f(x) \Big/ \sum_x x\,f(x) \qquad \text{(E-7)}$$

where f(x) is the frequency of a string of x ones. Therefore
$\sum_x f(x)$ is the number of separate cloud groups (g), and
$\sum_x x\,f(x) = n\,NL$. The latter quantity is of course the total
number of ones in the series of observations. Thus we have

$$P(0 \mid 1) = g / (n\,NL) \qquad \text{(E-8)}$$

but from Eqs. (E-2) and (E-6) we have

$$P(0 \mid 1) = 1 - \rho_1 - NL\,(1 - \rho_1) = (1 - \rho_1)(1 - NL).$$

Therefore, we have

$$\rho_1 = 1 - \frac{g}{n\,NL\,(1 - NL)}. \qquad \text{(E-1)}$$

The above expression for the one-lag autocorrelation can be derived
directly from the standard definition of $\rho_1$, Eq. (E-5). If we are
careful to define the domain of the averaging operator so as to account
for end effects, then the domain of i in Eq. (E-5) is i = 1, 2, ..., n-1.

In terms of quantities which have already been defined,

$$\overline{y_i\,y_{i+1}} \approx (n\,NL - g)/n \qquad \text{(E-9)}$$

$$\bar{y}_i \approx NL \qquad \text{(E-10)}$$

$$\bar{y}_{i+1} \approx NL \qquad \text{(E-11)}$$

Substituting Eqs. (E-9), (E-10) and (E-11) into Eq. (E-5) we obtain
Eq. (E-1) directly,

$$\rho_1 = \frac{(n\,NL - g)/n - (NL)^2}{NL\,(1 - NL)} = 1 - \frac{g}{n\,NL\,(1 - NL)}.$$

In Eqs. (E-9), (E-10), and (E-11) the expressions are indicated as
approximate. The approximation arises because the domain of the
averaging operator is i = 1, 2, ..., n-1, whereas in Eq. (E-9) we have
divided by n. Similarly, in Eqs. (E-10) and (E-11) NL is the cloud
fraction for the entire series of n observations, whereas $\bar{y}_i$ and
$\bar{y}_{i+1}$ are averages over only (n-1) observations. The
approximation however is certainly very good for all but extremely short
series of observations.

Therefore, in any series, where the elements may be expressed as
zeros or ones, the simple relation for the one-lag autocorrelation,
Eq. (E-1), may find useful application.
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APPENDIX  F 


ADDITIONAL  THEORETICAL  SAMPLING  DISTRIBUTIONS 

Here we present some simple sampling models examined during the
course of this contract. These models have assisted our understanding
and may possibly be of value in future related studies.

Because many atmospheric processes are known to be Markovian in
nature we early on made a simple Markov model and studied its
characteristics. While the empirical data we later collected through McIDAS
appear to be samples of a sub-Markovian process, nevertheless the
Markov model is a reasonable one and does provide insight. We proceed
to outline its structure.

Let $\{X_n\}$, n = 1, 2, 3, ..., represent the sequence of zeros and
ones returned by a straight line pass of the APV. If the areal cloud
fraction is NA, then for any i the probability that $X_i = 1$ is NA and
the probability that $X_i = 0$ is 1 - NA. We wish to introduce the
notion of persistence into our model. Assume that the first N values of
the sequence are known. We model the conditional probabilities of
$X_{N+1}$ as follows, introducing persistence through the parameter
$\alpha$, with $0 \le \alpha \le 1$:

$$
\begin{aligned}
\Pr(X_{N+1} = 1 \mid X_N = 1) &= \alpha + (1-\alpha)\,\mathrm{NA} \\
\Pr(X_{N+1} = 0 \mid X_N = 1) &= (1-\alpha)(1-\mathrm{NA}) \\
\Pr(X_{N+1} = 1 \mid X_N = 0) &= (1-\alpha)\,\mathrm{NA} \\
\Pr(X_{N+1} = 0 \mid X_N = 0) &= \alpha + (1-\alpha)(1-\mathrm{NA}).
\end{aligned}
\tag{F-1}
$$
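The transition law of Eq. (F-1) can be sampled directly. The following is a minimal sketch, not the report's own code: the function names are hypothetical, and it assumes the first value is drawn with Pr(X_1 = 1) = NA. It uses the compact form Pr(X_{N+1} = 1 | X_N) = alpha * X_N + (1 - alpha) * NA, which reproduces the first and third lines of Eq. (F-1), the other two being their complements.

```python
import random


def simulate_markov_clouds(n, na, alpha, seed=None):
    """Draw n binary cloud reports from the persistence model of Eq. (F-1).

    na    : areal cloud fraction NA (probability of a 1)
    alpha : persistence parameter, 0 <= alpha <= 1
    Assumption: the first value is drawn with Pr(X_1 = 1) = NA.
    """
    rng = random.Random(seed)
    x = [1 if rng.random() < na else 0]
    for _ in range(n - 1):
        # Pr(next = 1 | last) = alpha * last + (1 - alpha) * NA
        p_one = alpha * x[-1] + (1.0 - alpha) * na
        x.append(1 if rng.random() < p_one else 0)
    return x


def lag1_autocorrelation(x):
    """Sample one-lag autocorrelation of a sequence (needs both 0s and 1s)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    cov = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1)) / (n - 1)
    return cov / var


if __name__ == "__main__":
    sample = simulate_markov_clouds(100_000, na=0.3, alpha=0.6, seed=1)
    print("sample cloud fraction:", sum(sample) / len(sample))
    print("sample lag-1 autocorrelation:", lag1_autocorrelation(sample))
```

For a long sample, the printed cloud fraction and one-lag autocorrelation should lie close to the NA and alpha that were supplied.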

It is possible to think of sequences whose (N+1)st values are dependent
upon all of the values $\{X_n\}$, n = 1, 2, ..., N. The fact that the
probability law for $X_{N+1}$ in our model Eq. (F-1) depends only on
the last value $X_N$ makes our process Markovian. It should be clear
that once the value for $X_1$ is given, Eq. (F-1) inductively
determines the probability law for $X_2$, $X_3$, $X_4$, ... To start
the process it is natural to take $\Pr(X_1 = 1) = \mathrm{NA}$ and
$\Pr(X_1 = 0) = 1 - \mathrm{NA}$.

We have now completely specified our (Markovian) model Eq. (F-1). It is
not hard to show that for any value of N, the expected value of $X_N$
is NA. Further, the one-lag autocorrelation of the process is $\alpha$.
Because the one-lag autocorrelation is called $\rho$ in earlier
sections, we shall here make the identification $\alpha = \rho$. It is
not hard to see that if $\alpha = 0$ then the $X_N$'s are completely
independent. As $\alpha$ increases toward 1 the effect of $X_N$ on
$X_{N+1}$ increases, until in the extreme case $\alpha = 1$ all of the
$X_N$'s are equal to the value of $X_1$. It is for this reason that
$\alpha$ is considered to be a measure of persistence.
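The two "not hard to show" statements can be checked in a few lines. The derivation below is a sketch added for the reader, writing $\alpha$ for the persistence parameter of Eq. (F-1) and assuming 0 < NA < 1 so that the correlation is defined.

```latex
% Mean: condition on X_N and apply Eq. (F-1).
\begin{align*}
\Pr(X_{N+1}=1)
  &= [\alpha + (1-\alpha)\mathrm{NA}]\,\Pr(X_N=1)
     + (1-\alpha)\mathrm{NA}\,\Pr(X_N=0) \\
  &= \alpha\,\Pr(X_N=1) + (1-\alpha)\,\mathrm{NA}.
\end{align*}
% If Pr(X_N = 1) = NA, the right side is NA, so starting from
% Pr(X_1 = 1) = NA gives E[X_N] = NA for every N by induction.

% One-lag autocorrelation: X_N X_{N+1} = 1 only when both are 1.
\begin{align*}
E[X_N X_{N+1}] &= \mathrm{NA}\,[\alpha + (1-\alpha)\mathrm{NA}], \\
\operatorname{Cov}(X_N, X_{N+1})
  &= \mathrm{NA}\,[\alpha + (1-\alpha)\mathrm{NA}] - \mathrm{NA}^2
   = \alpha\,\mathrm{NA}(1-\mathrm{NA}), \\
\operatorname{Var}(X_N) &= \mathrm{NA}(1-\mathrm{NA}), \\
\frac{\operatorname{Cov}(X_N, X_{N+1})}{\operatorname{Var}(X_N)} &= \alpha .
\end{align*}
```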

In operational practice a sample sequence of zeros and ones is provided
by the APV, and we wish to draw inferences from it. Before we can do
so, however, we need to know more about the model Eq. (F-1); in
particular we ask first: given that we know both NA and $\rho$
($= \alpha$) and that we make a total of N observations, what is the
probability that exactly k of our observations are equal to one? Let us
write this probability as ${}_{N}S(k;\ \mathrm{NA}, \rho)$. We have
written a program called PROB which calculates ${}_{N}S$. It is
worthwhile to note that

$$
\sum_{k=0}^{N} {}_{N}S(k;\ \mathrm{NA}, \rho) = 1
\tag{F-2}
$$

$$
{}_{N}S(k;\ \mathrm{NA}, 0) = \binom{N}{k}\,\mathrm{NA}^{k}\,(1-\mathrm{NA})^{N-k}.
\tag{F-3}
$$

Eq. (F-3) is a statement of the fact that in the limiting case
$\rho = 0$ our model reduces to a sequence of N independent trials of
the binomial distribution with parameter NA.
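The listing of PROB is not reproduced in this appendix, so the following is only a sketch of one way such a distribution could be computed: a forward recursion over the pair (number of ones so far, last value observed). The function name and argument order are hypothetical. The normalization of Eq. (F-2) and the binomial limit of Eq. (F-3) serve as checks.

```python
from math import comb


def count_distribution(n, na, rho):
    """Return [S(0), ..., S(n)], S(k) = Pr(exactly k ones in n observations).

    Uses the Markov model of Eq. (F-1) with Pr(X_1 = 1) = NA.
    prob[k][s] is the probability of k ones so far with last value s.
    """
    prob = [[0.0, 0.0] for _ in range(n + 1)]
    prob[0][0] = 1.0 - na
    prob[1][1] = na
    # Pr(next = 1 | last = s) for s = 0 and s = 1, from Eq. (F-1)
    p_one_after = ((1.0 - rho) * na, rho + (1.0 - rho) * na)
    for _ in range(n - 1):
        new = [[0.0, 0.0] for _ in range(n + 1)]
        for k in range(n):  # at most n - 1 ones before the last step
            for s in (0, 1):
                p = prob[k][s]
                if p == 0.0:
                    continue
                p1 = p_one_after[s]
                new[k][0] += p * (1.0 - p1)   # next observation is clear
                new[k + 1][1] += p * p1       # next observation is cloud
        prob = new
    return [prob[k][0] + prob[k][1] for k in range(n + 1)]


if __name__ == "__main__":
    dist = count_distribution(10, 0.3, 0.0)
    binom = [comb(10, k) * 0.3**k * 0.7 ** (10 - k) for k in range(11)]
    print(max(abs(a - b) for a, b in zip(dist, binom)))  # rho = 0 check
```

The recursion costs on the order of N squared operations, so N = 100 is immediate.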

For the general case $\rho \ne 0$ it is clear that persistence makes
our sample of N observations dependent on one another. We ask, as in
previous sections: find an appropriate measure of the amount of
independent information contained in any sample sequence. We make this
question more explicit: using the chi-square test, find the value of N'
so that ${}_{N'}S(k;\ \mathrm{NA}, 0)$ best fits
${}_{N}S(k;\ \mathrm{NA}, \rho)$.

The following table gives the results for the case N = 100; for each
(NA, $\rho$) pair, the entry is the value of N':

TABLE F-1.  N' AS A FUNCTION OF NA AND ρ (N = 100).

                              NA
   ρ      .1    .2    .3    .4    .5    .6    .7    .8    .9

  .1                              >50
  .2                              >50
  .3                              >50
  .4      44    43    43    43    43    43    43    43½   44
  .5      35    34    34    34    34    34    34    34    35
  .6      25    26    25    25    25    25    26    26    25
  .7      15    18    18    18    18    18    18    18    15
  .8      14    10    11    12    12    12    12    10    14
  .9      17½    8     5     4     4     4     4     8    17½
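As a rough cross-check on the order of magnitude of N', one can use a different and simpler criterion than the chi-square fit used for the table: choose N' so that N' independent trials give the same variance of the observed cloud fraction as N persistent ones. This is a substituted technique, not the report's method. It relies on the fact that a two-state chain of the form of Eq. (F-1) has lag-j autocorrelation rho**j. The function name is hypothetical, and the result does not depend on NA, unlike the tabulated values.

```python
def effective_sample_size(n, rho):
    """Variance-matching estimate of N' for n observations with persistence rho.

    Var(cloud fraction) = NA (1 - NA) / n * inflation, where
    inflation = 1 + 2 * sum_{j=1}^{n-1} (1 - j / n) * rho**j.
    Setting this equal to NA (1 - NA) / N' gives N' = n / inflation.
    """
    inflation = 1.0 + 2.0 * sum((1.0 - j / n) * rho**j for j in range(1, n))
    return n / inflation


if __name__ == "__main__":
    for rho in (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9):
        print(rho, round(effective_sample_size(100, rho), 1))
```

For N = 100 and rho = 0.5 this gives about 33.8, to be compared with the tabulated 34 to 35; agreement with the chi-square values should not be expected to be exact, particularly at large rho or extreme NA.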

We now turn the problem around; instead of a probability law being
given, a sample of N observations is considered given. In addition, the
climatology function CP($\ell$), $\ell$ = 0, 1, 2, ..., 10, the
climatological probability of the cloud cover being $\ell/10$, is
assumed to be given. From the sample we can obtain an N' in various
ways, including, for example, using the above table. As in Appendix B,
Bayes' Theorem then provides the conditional probabilities
Pr(NA | NL), the probability that the actual cloud cover is NA given
the observed cloud fraction NL. The computer program NPRIME performs
this calculation for a given climatology, N', and observation, NL. From
this output confidence levels may be obtained.

There is another (related) way to use Bayes' Theorem without
calculating an N'. Given a sample of N total observations, calculate
the observed cloud fraction NL and RHOSAM, the sample one-lag
autocorrelation. Just as Bayes' Theorem can be used to work from
$\{{}_{N'}S(k;\ \mathrm{NA}, 0)\}$, NA = 0, .1, ..., 1.0, together with
a climatology, it can also be used to work from
$\{{}_{N}S(k;\ \mathrm{NA}, \mathrm{RHOSAM})\}$, NA = 0, .1, ..., 1.0,
and a climatology, calculating as above Pr(NA | NL). Program DPROR
performs the calculation.
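The Bayes step with a given N' can be sketched as follows. This is an illustration under stated assumptions, not the NPRIME listing: the function name is hypothetical, the likelihood is taken to be the binomial of Eq. (F-3) with N' trials, and the observed fraction NL is converted to a count by rounding NL times N'. Substituting the persistent distribution evaluated at RHOSAM for the binomial likelihood would give the second variant described above.

```python
from math import comb


def posterior_cloud_cover(climatology, n_prime, nl):
    """Posterior Pr(NA = l/10 | NL) for l = 0..10.

    climatology : 11 prior probabilities CP(l), l = 0..10
    n_prime     : number of equivalent independent observations N'
    nl          : observed cloud fraction NL
    Assumptions: binomial likelihood with N' trials; k = round(NL * N').
    """
    if len(climatology) != 11:
        raise ValueError("climatology must have 11 entries (l = 0..10)")
    k = round(nl * n_prime)
    weights = []
    for level, prior in enumerate(climatology):
        na = level / 10
        likelihood = comb(n_prime, k) * na**k * (1.0 - na) ** (n_prime - k)
        weights.append(prior * likelihood)
    total = sum(weights)
    if total == 0.0:
        raise ValueError("observation has zero probability under this climatology")
    return [w / total for w in weights]


if __name__ == "__main__":
    flat = [1 / 11] * 11  # hypothetical uniform climatology
    for level, p in enumerate(posterior_cloud_cover(flat, 20, 0.5)):
        print(f"NA = {level / 10:.1f}: {p:.4f}")
```

Confidence levels follow by summing the posterior over the range of NA of interest.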

The final section of this appendix addresses the question of how to
proceed to obtain confidence levels from a sample sequence when no
climatology is available. One method is to apply Gauss' maximum
likelihood estimator to estimate unknown parameters.⁴

4. Larson, H., 1969: Introduction to Probability Theory and Statistical
Inference. John Wiley & Sons, pp 253.

Assume that we are given a sample sequence $\{X_n\}$ of N
observations. Proceeding as before, we can reduce to N' independent
Bernoulli trials in a variety of ways. If N' is large enough, it can be
shown that if we want to find $P_1$ and $P_2$ such that

$$
\Pr(P_1 < \mathrm{NA} < P_2) = 1 - \beta,
\tag{F-4}
$$

where NA is the (unknown) cloud cover, then

$$
P_1 = \mathrm{NL} - \frac{z}{\sqrt{N'}}\,\sqrt{\mathrm{NL}\,(1-\mathrm{NL})}
\tag{F-5}
$$

and

$$
P_2 = \mathrm{NL} + \frac{z}{\sqrt{N'}}\,\sqrt{\mathrm{NL}\,(1-\mathrm{NL})},
$$

where $z$ is the $100\,(1-\beta/2)$ percentile of the standard (mean 0,
variance 1) normal distribution function and NL is the observed cloud
fraction of the given sample.
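The limits P_1 and P_2 are straightforward to evaluate. The sketch below uses a hypothetical function name and takes the percentile from the Python standard library's `statistics.NormalDist`; it assumes N' is large enough for the normal approximation to hold.

```python
from math import sqrt
from statistics import NormalDist


def cloud_cover_interval(nl, n_prime, beta):
    """Return (P1, P2) with Pr(P1 < NA < P2) approximately 1 - beta.

    nl      : observed cloud fraction NL
    n_prime : number of equivalent independent observations N'
    beta    : total tail probability, e.g. 0.05 for a 95 percent interval
    """
    # 100 (1 - beta/2) percentile of the standard normal distribution
    z = NormalDist().inv_cdf(1.0 - beta / 2.0)
    half_width = z * sqrt(nl * (1.0 - nl) / n_prime)
    return nl - half_width, nl + half_width


if __name__ == "__main__":
    print(cloud_cover_interval(0.5, 25, 0.05))
```

With NL = 0.5, N' = 25, and beta = 0.05 the limits are roughly 0.304 and 0.696. The interval is not clipped to [0, 1], so for NL near 0 or 1 and small N' it can extend outside the physical range.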

Our last technique combines the notion of a Markov process with Gauss'
maximum likelihood estimator.

Assume that we are given a sample $\{X_n\}$ of N observations of some
unknown Markov process. Can we find that Markov process which is most
likely to have generated our sample set of observations? We have done
so, stating the result below.

Given $\{X_n\}$, let

    $\omega_{00}$ = # of times a 0 follows a 0,
    $\omega_{01}$ = # of times a 1 follows a 0,
    $\omega_{10}$ = # of times a 0 follows a 1,
    $\omega_{11}$ = # of times a 1 follows a 1.

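Let

$$
A = \frac{\omega_{10}}{\omega_{11}}
\quad\text{and}\quad
B = \frac{\omega_{01}}{\omega_{00}}.
\tag{F-8}
$$

Define

$$
\mathrm{CL} = \text{most likely NA}, \qquad
\mathrm{RHOL} = \text{most likely } \rho.
\tag{F-9}
$$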




A calculation shows that

$$
\mathrm{CL} = \frac{B\,(1 + A)}{A + B + 2AB}
\tag{F-10}
$$

$$
\mathrm{RHOL} = \frac{1 - AB}{(1 + A)(1 + B)}.
\tag{F-11}
$$


Given the sample $\{X_n\}$ of N observations, one can write in
analytical form the probability function $P = P(\mathrm{NA}, \rho)$
that is the probability of obtaining the sequence $\{X_n\}$ from the
Markov process (NA, $\rho$). One then finds through elementary calculus
that P is maximized at (CL, RHOL), yielding the above result. It is not
unreasonable to expect that levels of confidence for this result can be
obtained from the function P. We have thus outlined a scheme whereby,
under the Markov assumption and with no climatology, one can pass from
a line sample to the most likely generating Markov process. Work that
might be done in the future includes finding the confidence interval
for this result.
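The estimator of Eqs. (F-10) and (F-11) reduces to counting the four kinds of adjacent pairs in the sample. The sketch below uses a hypothetical function name and takes A as the ratio of "0 follows 1" to "1 follows 1" counts and B as the ratio of "1 follows 0" to "0 follows 0" counts; with this reading the two formulas are algebraically the same as estimating the two transition probabilities of Eq. (F-1) by their observed frequencies. It raises an error when either denominator count is zero rather than guessing a limit.

```python
def markov_mle(x):
    """Return (CL, RHOL): most likely NA and rho for a 0/1 sequence x."""
    # counts[(prev, nxt)] = number of times nxt follows prev
    counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for prev, nxt in zip(x, x[1:]):
        counts[(prev, nxt)] += 1
    w00, w01 = counts[(0, 0)], counts[(0, 1)]
    w10, w11 = counts[(1, 0)], counts[(1, 1)]
    if w00 == 0 or w11 == 0:
        raise ValueError("need at least one 0-0 pair and one 1-1 pair")
    a = w10 / w11  # odds of leaving cloud versus staying in cloud
    b = w01 / w00  # odds of entering cloud versus staying in clear air
    cl = b * (1.0 + a) / (a + b + 2.0 * a * b)       # Eq. (F-10)
    rhol = (1.0 - a * b) / ((1.0 + a) * (1.0 + b))   # Eq. (F-11)
    return cl, rhol


if __name__ == "__main__":
    print(markov_mle([0, 0, 1, 1, 1, 0, 0, 0, 1, 1]))
```

For the ten-element sequence in the usage line the pair counts are 3, 2, 1, and 3, so A = 1/3 and B = 2/3, giving CL = 8/13 and RHOL = 0.35.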


